---
title: "AI Security and Privacy Basics: What's Safe to Share"
description: Practical guide to AI privacy and security for business users. What happens to your data, which providers train on it, and how to use AI tools safely.
date: January 20, 2026
author: Robert Soares
category: ai-fundamentals
---

You paste a customer email into ChatGPT for help drafting a response. You upload a financial report to get a summary. You ask an AI assistant about your company's internal project. Should you worry about what happens to that data?

The short answer: it depends on which tool you're using and how you've configured it. The privacy practices of major AI providers vary significantly, and the defaults aren't always what you'd expect. Here's what you need to know.

## The Core Question: Is Your Data Used for Training?

When you interact with an AI chatbot, your conversation might be used to improve future versions of the model. This is the primary privacy concern for most business users.

[According to Stanford research examining six major AI providers](https://news.stanford.edu/stories/2025/10/ai-chatbot-privacy-concerns-risks-research), all six companies use chat data from their users by default to train their models, and some keep this information indefinitely.

That means when you use the free or consumer versions of ChatGPT, Gemini, or similar tools, your inputs likely contribute to training data. Pieces of your conversations could theoretically influence how the model responds to other users in the future.

[Stanford researcher Jennifer King puts it directly](https://hai.stanford.edu/news/be-careful-what-you-tell-your-ai-chatbot): when asked whether users should be worried about their privacy with AI chatbots, she responded, "Absolutely yes."

## How Different Providers Handle Data

The policies differ substantially between consumer and business tiers:

### OpenAI (ChatGPT)

**Consumer (ChatGPT Free/Plus):** By default, your conversations can be used for training. You can opt out in settings, but most users don't.

**Business tiers:** [OpenAI states explicitly](https://openai.com/enterprise-privacy/) that they do not train their models on data from ChatGPT Enterprise, ChatGPT Business, ChatGPT Edu, or their API platform by default.

**Data retention:** [For Enterprise customers](https://openai.com/business-data/), workspace admins control how long data is retained. Deleted conversations are removed from OpenAI's systems within 30 days. API customers can request zero data retention (ZDR) for eligible endpoints.

### Anthropic (Claude)

[Anthropic does not use your prompts or Claude's responses to train its generative models](https://www.sectionai.com/blog/your-privacy-guide-to-ai-chatbots) unless you explicitly opt in through feedback mechanisms or specific programs. Deleted conversations are removed from backend systems within 30 days.

This makes Claude one of the more privacy-friendly options by default.

### Google (Gemini)

Consumer Gemini conversations may be used for training by default. Google Workspace enterprise tiers have stronger protections.

### Meta (Meta AI)

Meta AI prompts may appear on a public feed in some contexts. The privacy implications are significant for any sensitive use.

## What Data Should You Never Share?

[Because conversations might be seen by human reviewers and used for training](https://www.lakera.ai/blog/chatbot-security), treat chatbots like a semi-public forum.

**Never share:**

- **Passwords or credentials.** Obviously. But people do this.
- **Social Security numbers, credit card numbers, or financial account details.** No legitimate AI use case requires this.
- **Customer personal data without consent.** GDPR and CCPA don't care that you were just "getting AI help."
- **Trade secrets or intellectual property.** If you'd fire someone for emailing it to a competitor, don't paste it into a consumer AI tool.
- **Medical records or health information.** [Stanford research warns](https://hai.stanford.edu/news/be-careful-what-you-tell-your-ai-chatbot) that even asking for health-related recipes could lead to you being classified as a health-vulnerable individual.
- **Legal documents with privileged information.** Attorney-client privilege doesn't extend to AI chat logs.
- **Source code for proprietary systems.** Your competitive advantage could end up in training data.

**Be cautious with:**

- Customer names and contact information
- Internal project names and details
- Financial figures and forecasts
- Strategic plans and discussions
- Employee information

[Studies estimate](https://www.lasso.security/blog/llm-data-privacy) that roughly 1 in 12 prompts employees send to unapproved public AI models contains confidential information. Don't be that employee.

## The Enterprise Tier Difference

Business-focused AI tiers exist specifically to address these concerns. [ChatGPT Enterprise and similar offerings](https://www.reco.ai/learn/chatgpt-enterprise-security) provide:

- **No training on your data.** Your conversations stay out of future model training.
- **Data encryption.** AES-256 at rest, TLS 1.2+ in transit.
- **Admin controls.** Workspace owners can set retention policies.
- **Data residency options.** Choose where your data is stored (US, Europe, UK, Japan, and others).
- **Enterprise Key Management (EKM).** Control your own encryption keys.
- **Audit logs.** Track who used what.
- **SSO integration.** Use your existing identity management.
- **Zero data retention options.** For API users with qualifying use cases, data isn't stored at all.

[According to OpenAI's enterprise documentation](https://academy.openai.com/public/clubs/admins-6o6xf/resources/data-governance-and-compliance), these protections apply to ChatGPT Enterprise, Business, Edu, Healthcare, and Teachers editions.

The cost difference between consumer and enterprise tiers is significant, but so is the risk difference.

## Data Exposure Incidents

This isn't theoretical. [User conversations with AI chatbots have been exposed](https://www.nowsecure.com/blog/2025/11/05/the-owasp-ai-llm-top-10-understanding-security-and-privacy-risks-in-ai-powered-mobile-applications/) in search engine results. Prompts from some AI apps have appeared on public feeds.

[LLM-powered chatbots collect and store significant amounts of data](https://www.cobalt.io/blog/security-risks-of-llm-powered-chatbots) from users, including conversation logs and personal information. This data is vulnerable to theft and misuse.

In mobile apps that integrate LLMs, the exposure risk increases dramatically. Messages, photos, location data, health records, and financial details can all potentially leak when AI models are involved.

## Legal Complications

A notable development: In June 2025, [a U.S. federal court ordered OpenAI](https://www.datastudios.org/post/chatgpt-data-retention-policies-updated-rules-and-user-controls-in-2025) to retain all chats and uploaded files, including deleted ones, until an ongoing lawsuit concludes.
This means conversations users thought were deleted are being preserved due to legal requirements. However, [ChatGPT Enterprise was explicitly excluded](https://openai.com/business-data/) from this preservation requirement, another reason enterprise tiers offer different protections.

## Best Practices for Business Use

### 1. Know Your Tier

Understand whether you're using consumer or business-grade AI tools. The privacy implications are fundamentally different.

If you're using free ChatGPT for work, you're likely contributing to training data. If you're paying for ChatGPT Plus, you still might be (check your settings). Enterprise and API tiers have stronger defaults.

### 2. Check Your Settings

Most providers offer some opt-out options even on consumer tiers. Find them. Use them.

OpenAI allows opting out of training data contribution in settings. Claude doesn't train on your data by default. Gemini has complex settings worth reviewing.

### 3. Use Approved Tools

Work with your IT or security team to identify approved AI tools. Shadow AI (employees using personal accounts for work tasks) creates uncontrolled data exposure.

[Organizations should establish](https://www.oligo.security/academy/llm-security-in-2025-risks-examples-and-best-practices) clear policies about which AI tools are acceptable and for what purposes.

### 4. Sanitize Before Sharing

Before pasting content into any AI tool, remove or replace:

- Real names with placeholders ("Client A" instead of "Acme Corp")
- Specific numbers with ranges or categories
- Identifying details that aren't necessary for the task

You can often get equally useful AI assistance without exposing the sensitive specifics. (A rough sketch of automating this step appears later in the article, after the privacy-preserving technologies section.)

### 5. Don't Paste Entire Documents

Instead of uploading a whole contract, describe the type of clause you need help with and paste only that section. Instead of dumping an entire codebase, share only the specific function you're debugging, with proprietary context stripped from the comments.

Less data in means less risk.

### 6. Consider Self-Hosted Options

For highly sensitive use cases, running open-source models (Llama, Mistral) on your own infrastructure keeps data entirely in your control. The tradeoff: you manage the infrastructure, but your data never leaves your environment.

## Privacy-Preserving Technologies

Emerging technologies aim to solve this problem at a technical level:

[Homomorphic encryption](https://spectrum.ieee.org/homomorphic-encryption-llm) allows computing on encrypted data without decryption. In theory, you could query an AI without the AI provider ever seeing your actual data. Companies like Duality are developing practical implementations.

**Differential privacy** introduces controlled noise to datasets, making it statistically difficult to extract individual data points from training.

**Confidential computing** uses trusted execution environments within CPUs for processing sensitive data.

These are mostly enterprise- and research-grade solutions today, but they point toward a future where AI privacy concerns might have technical solutions rather than just policy ones.
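
To make the differential privacy idea above concrete, here is a minimal sketch of the classic Laplace mechanism applied to a simple count query. This is an illustration of the general technique using made-up example values (the epsilon and the toy data are arbitrary), not a description of how any particular AI provider implements it.

```python
import numpy as np

def dp_count(records, predicate, epsilon=1.0):
    """Differentially private count of records matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon statistically masks whether any single individual is
    present in the data.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Toy example: how many (hypothetical) prompts touched on health topics?
prompts = [{"topic": "health"}, {"topic": "billing"}, {"topic": "health"}]
print(dp_count(prompts, lambda p: p["topic"] == "health", epsilon=0.5))
```

Lower epsilon means more noise and stronger privacy. Real systems apply mechanisms like this across entire training or analytics pipelines, with carefully tracked privacy budgets, rather than to a single query.
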
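
Returning to the practical side, the sanitization step from practice 4 earlier in this article can be partially automated. Below is a minimal sketch of a pre-flight scrubber built on regular expressions; the patterns and placeholder labels are illustrative assumptions and deliberately cover only obvious, well-structured identifiers. Names, project codenames, and free-form context still need human judgment or a proper redaction/DLP tool.

```python
import re

# Illustrative patterns only: obvious, well-structured identifiers.
# Real data (names, addresses, internal codenames) needs broader handling.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def scrub(text: str) -> str:
    """Replace obvious identifiers with placeholders before text goes to an AI tool."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(scrub("Reach Jane at jane.doe@acme.com or 555-867-5309 re: card 4111 1111 1111 1111."))
# -> Reach Jane at [EMAIL] or [PHONE] re: card [CARD].
```

A scrubber like this is a safety net, not a guarantee; the habit of pasting less in the first place matters more than any particular filter.
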
## The OWASP Perspective

The security community tracks AI-specific risks. [The OWASP AI/LLM Top 10](https://www.nowsecure.com/blog/2025/11/05/the-owasp-ai-llm-top-10-understanding-security-and-privacy-risks-in-ai-powered-mobile-applications/) identifies key vulnerabilities in AI-powered applications:

- **Prompt injection:** Malicious inputs that hijack model behavior
- **Data leakage:** Unintended exposure of training or user data
- **Inadequate sandboxing:** AI systems with too much access
- **Excessive agency:** AI that can take actions with insufficient controls

For developers building AI features, these are essential considerations. For users, they're a reminder that AI systems have attack surfaces like any software.

## Policy Recommendations

[Stanford researchers recommend](https://news.stanford.edu/stories/2025/10/ai-chatbot-privacy-concerns-risks-research) several policy changes:

- **Comprehensive federal privacy regulation** for AI services
- **Affirmative opt-in** for model training (not opt-out buried in settings)
- **Filtering personal information** from chat inputs by default

Until such regulations exist, users bear responsibility for protecting their own data.

## The Practical Takeaway

AI tools are powerful. They're also potential data exposure points.

**For personal use:** Be aware that consumer AI tools likely train on your data. Don't share anything you'd be uncomfortable seeing in a news article.

**For business use:**

1. Use enterprise-grade tools with explicit no-training policies
2. Check and configure privacy settings
3. Establish organizational policies about acceptable AI use
4. Sanitize sensitive information before AI interactions
5. Train employees on AI privacy practices

**For highly sensitive work:** Consider whether AI assistance is worth the privacy tradeoff. Sometimes the answer is no.

The companies building AI tools have different incentives than you do. They want training data. You want privacy. Understanding this dynamic helps you use AI tools more safely.

The technology is valuable. Use it with awareness of what you're sharing and where it might go.