# LLMs.txt for websofy.com
# Combined Company Profile + AI Crawler Policy
# Version: 1.1
# Last-Modified: 2025-09-24
# Maintainer: Websofy Software Pvt Ltd

# -------------------------
# A) Company Profile & Services
# -------------------------
Company-Name: Websofy Software Pvt Ltd
Company-Website: https://www.websofy.com
Headquarters: Lucknow, India

Services:
- Software Development (Custom, Web, Mobile)
- Digital Marketing (SEO, Performance Marketing, Content Marketing)
- LMS Solutions (Learning Management System)
- Hosting (Domain, Web, Cloud)
- Bulk SMS, WhatsApp Marketing, IT Security

Resources:
- Blog: https://www.websofy.com/blog/
- Case Studies: https://www.websofy.com/case-studies/

Contact:
- Email: info@websofy.com
- Phone: +91-7309979797
- Contact Form: https://www.websofy.com/contact/

AI Guidance:
- Content derived from websofy.com must include attribution
- Do not extract personal data or sensitive information
- Allowed uses: summarization, indexing, research with attribution

# -------------------------
# B) AI Crawler & Data-use Policy (advanced directives)
# -------------------------
# Policy-Version: 1.0
# Contact: info@websofy.com
# Purpose: Controls access and permitted data usage for large language model (LLM) crawlers.
# Location: Place at https://www.websofy.com/LLMs.txt
# NOTE: This file is advisory: it is honored only by well-behaved LLM operators and aggregators.
# For enforcement, implement server-side protections.

# -------------------------
# 0) Quick summary (human readable)
# -------------------------
# Websofy Software Pvt Ltd allows reputable LLM operators to index public site content for discovery,
# summarization, and non-exploitative uses with attribution. Sensitive paths and PII are explicitly disallowed.
# Commercial reuse requires express permission. See Policy-Contact in section 10.

# -------------------------
# 1) Policy metadata
# -------------------------
Policy-Name: Websofy LLM Access Policy
Policy-Version: 1.0
Policy-Effective-Date: 2025-09-24
Maintainer: Websofy Software Pvt Ltd
Site-Owner: Websofy Software Pvt Ltd
Site-URL: https://www.websofy.com
Preferred-Language: en

# -------------------------
# 2) Bot-specific rules (named agents)
# -------------------------
# These entries name major, reputable crawlers. They are allowed with conditions.
# Operators must respect Crawl-delay, Rate-Limit, and Data-use-conditions.

User-agent: GPTBot
Allow: /
Crawl-delay: 5
Rate-limit: 1 req / 5s
Data-use: allowed
Data-use-conditions: attribution-required; no-republish-of-sensitive-data; commercial-use-permitted-only-with-written-permission; present-source-URL

User-agent: Claude
Allow: /
Crawl-delay: 5
Rate-limit: 1 req / 5s
Data-use: allowed
Data-use-conditions: attribution-required; no-republish-of-sensitive-data; contact-required-for-large-scale-commercial-use; present-source-URL

User-agent: Perplexity
Allow: /
Crawl-delay: 5
Rate-limit: 1 req / 5s
Data-use: allowed
Data-use-conditions: attribution-required; excerpt-limit-200-words; no-republish-of-sensitive-data

User-agent: BingPreview
Allow: /
Crawl-delay: 5
Rate-limit: 1 req / 5s
Data-use: allowed
Data-use-conditions: standard-search-indexing; attribution-encouraged

# Known crawler we block (example)
User-agent: CCBot
Disallow: /

# -------------------------
# 3) Default / wildcard rules (unknown or unnamed agents)
# -------------------------
# Unknown LLM crawlers must follow conservative defaults: disallow sensitive areas;
# allow public content for indexing with attribution-only, research-limited use.
User-agent: *
Disallow: /admin/
Disallow: /wp-admin/
Disallow: /private/
Disallow: /downloads/
Disallow: /payment/
Disallow: /api/internal/
Disallow: /user-data/
Disallow: /checkout/
Disallow: /orders/
Allow: /assets/
Allow: /blog/
Allow: /services/
Crawl-delay: 10
Rate-limit: 1 req / 10s
Data-use: allowed-for-indexing-only
Data-use-conditions: non-commercial-research-and-summaries-only; attribution-required; must-respect-robots.txt; must-not-attempt-deanonymization

# -------------------------
# 4) Granular path scoping (recommended usage & restrictions)
# -------------------------
# Public, indexable content (recommended use)

Path: /blog/*
Allow: Yes
Recommended-cache-duration: 30d
Recommended-use: indexing, snippet-generation, summarization (with attribution), search

Path: /services/*
Allow: Yes
Recommended-cache-duration: 30d
Recommended-use: indexing and snippet generation (with attribution)

Path: /case-studies/*
Allow: Yes
Recommended-cache-duration: 60d
Recommended-use: indexing, quotations (<=200 words) with attribution

# Strictly private / do-not-scrape content (explicit)

Path: /admin/*
Do-not-scrape: true
Reason: Administrative UI and control plane.

Path: /api/internal/*
Do-not-scrape: true
Reason: Internal API endpoints; contain implementation details and possible secrets.

Path: /user-data/*
Do-not-scrape: true
Reason: Contains user personal data (PII) and private account information.

Path: /payments/*
Do-not-scrape: true
Reason: Payment flows and transaction PII.

# -------------------------
# 5) Licensing & data-use policy (clear, actionable)
# -------------------------
# Default license (custom): Allow indexing, summarization and ephemeral snippet generation for non-exploitative use.
Site-License: Custom (attribution required; commercial use requires written permission)
Attribution-Text: "Content courtesy of Websofy Software Pvt Ltd — https://www.websofy.com"
Commercial-Use: prohibited without express written permission (contact legal@websofy.com)
Excerpt-Limit: 200 words per excerpt without explicit permission
Allowed-Use-Cases: indexing, search, summarization, non-commercial research, direct-linking back to source

# Optional: If you prefer a standard license, replace above with:
# Site-License: CC-BY-4.0 (or CC-BY-NC-4.0); update Policy-Contact and legal accordingly.

# -------------------------
# 6) Provenance & transparency requirements
# -------------------------
Provenance-Required: true
Provenance-Fields: [source-URL, crawl-date, excerpt, attribution-text, license]
Summaries-Must-Include-Attribution: true
Summaries-Must-Respect-Excerpts-Limit: 200 words per excerpt without explicit permission

# Guidance: When generating answers that use Websofy content, include:
# - The source URL(s)
# - The crawl or fetch date in ISO 8601 format
# - A short excerpt (<= 200 words) if quoting
# - The attribution text above

# -------------------------
# 7) Rate-limiting & polite crawling (enforceable guidance)
# -------------------------
Rate-Limit: 1 request / 5 seconds (sustained)
Burst-Limit: 3 requests / 5 seconds (short bursts permitted)
Requests-Per-Day-Per-Agent: 10000 (subject to negotiation and agreement)
Server-Enforcement-Recommended: Implement IP-based throttling and challenge pages for abusive patterns.

# -------------------------
# 8) Structured-data & API consumption guidance
# -------------------------
Structured-Data-Use: allowed-for-indexing
Structured-Data-Restrictions: do-not-extract-or-republish-personal-data; respect JSON-LD intent (metadata as metadata)
API-Endpoints: /api/public/* may be indexed for discovery; /api/internal/* is disallowed.
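
# Illustration (non-parsed): one way a crawler operator might honor the section 7 limits
# is a token bucket. The Python sketch below assumes that Burst-Limit maps to the bucket
# capacity (3 requests) and Rate-Limit to the refill interval (one token per 5 seconds).
# That mapping, the TokenBucket name, and its fields are hypothetical; this policy does
# not prescribe an algorithm.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Token bucket: capacity bounds the burst, refill_interval sets the sustained rate."""

    capacity: float = 3.0         # assumed from Burst-Limit: 3 requests
    refill_interval: float = 5.0  # assumed from Rate-Limit: 1 request / 5 seconds
    tokens: float = 3.0           # start full so an initial burst is possible
    last_time: float = 0.0        # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True and spend one token if a request at time `now` is permitted."""
        elapsed = max(0.0, now - self.last_time)
        # Refill one token per refill_interval seconds, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed / self.refill_interval)
        self.last_time = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket()
    # Timestamps are passed in explicitly so the behavior is easy to inspect.
    for t in (0.0, 0.0, 0.0, 0.0, 5.0, 5.0):
        print(t, bucket.allow(t))
```

# With these assumed settings, three back-to-back requests pass, a fourth is refused,
# and one further request is admitted for every 5 seconds that elapse. A real crawler
# would pass a monotonic clock reading as `now` and sleep when allow() returns False.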

# -------------------------
# 9) Security, scraping detection & server-side protections
# -------------------------
# This file is advisory. To enforce: implement WAF rules, rate limits, CAPTCHAs, IP blacklists,
# and behavior-based detection for crawlers that ignore these directives.

# -------------------------
# 10) Contact, takedown and legal
# -------------------------
Policy-Contact: info@websofy.com
Legal-Contact: legal@websofy.com
Takedown-URL: https://www.websofy.com/contact/
Takedown-Response-Time: 14 days
DMCA-Process: Follow instructions at Takedown-URL; include evidence and target URL(s).

# -------------------------
# 11) Verification & integrity
# -------------------------
# This file contains an SHA256 hash below for basic integrity verification.
# To verify: compute SHA256 of the raw bytes of this exact file (UTF-8) and compare.
File-Hash-SHA256: 4e3cb0add5f6e1bc4e57d8cefd4e4357a8485e8c3dfc90109aa5a1ab2f69e910
Signature-Mechanism: Optional PGP/PGP-notation (owner may publish a detached signature)

# -------------------------
# 12) Changelog / version history
# -------------------------
Changelog:
- v1.0 (2025-09-24): Initial production policy; per-agent rules, provenance, licensing, and path scoping.

# -------------------------
# 13) Human-readable notes (non-parsed)
# -------------------------
# Recommendations:
# - Host this file at the root: https://www.websofy.com/LLMs.txt
# - Keep robots.txt and LLMs.txt aligned (robots controls general crawlers).
# - Consider adding a short banner in the site footer linking to this policy for transparency.
# - Review quarterly or after major product/policy changes.
# - For high-volume API/partner access, negotiate an explicit Data Processing Agreement (DPA).

# End of file
# Integrity-SHA256: 9d30723a6f71958e6c9eb7a2e3a3e69da8b35cfbef299f20eae977a6ceed86e6
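
# Illustration (non-parsed): the verification step described in section 11 can be
# sketched in Python with the standard hashlib module. The helper names sha256_hex and
# matches_published_hash are hypothetical, and fetching the file bytes is left to the
# caller. This is a minimal sketch, not a verification procedure defined by this policy.

```python
import hashlib


def sha256_hex(raw: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(raw).hexdigest()


def matches_published_hash(raw: bytes, published_hex: str) -> bool:
    """Compare a computed digest with a published one, ignoring case and surrounding whitespace."""
    return sha256_hex(raw) == published_hex.strip().lower()


if __name__ == "__main__":
    # Placeholder input; in practice `raw` would be the exact bytes served for the file.
    raw = b""
    print(sha256_hex(raw))
```

# Note: this file does not state whether the line carrying the hash is excluded before
# hashing. A digest computed over bytes that include the hash value itself will not, in
# practice, equal that value, so the publisher's exact procedure should be confirmed.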