# Sourcegraph CSRF security model
This section describes, in depth, Sourcegraph's threat model for CSRF (Cross Site Request Forgery) attacks. It is written for developers working at Sourcegraph.
If you are looking for general information or wish to disclose a vulnerability, please see our [security policy](https://sourcegraph.com/security/) or [how to disclose vulnerabilities](https://sourcegraph.com/handbook/engineering/security/reporting-vulnerabilities).
- [Sourcegraph CSRF security model](#sourcegraph-csrf-security-model)
- [Living document](#living-document)
- [Prerequisites](#prerequisites)
- [Scope](#scope)
- [What is CSRF, why is it dangerous?](#what-is-csrf-why-is-it-dangerous)
- [How is CSRF mitigated traditionally?](#how-is-csrf-mitigated-traditionally)
- [Sourcegraph's CSRF security model](#sourcegraphs-csrf-security-model)
- [Diagrams](#diagrams)
- [Request delineation: API and non-API endpoints](#request-delineation-api-and-non-api-endpoints)
- [Where requests come from](#where-requests-come-from)
- [Non-API endpoints](#non-api-endpoints)
- [Non-API endpoints are generally static, unprivileged content only](#non-api-endpoints-are-generally-static-unprivileged-content-only)
- [A note about window.context](#a-note-about-windowcontext)
- [Exclusion: username/password manipulation (sign in, password reset, etc.)](#exclusion-usernamepassword-manipulation-sign-in-password-reset-etc)
- [Risk of CSRF attacks against our non-API endpoints](#risk-of-csrf-attacks-against-our-non-api-endpoints)
- [How we protect against CSRF in non-API endpoints](#how-we-protect-against-csrf-in-non-api-endpoints)
- [API endpoints](#api-endpoints)
- [All mutable and privileged actions go through Sourcegraph's API endpoints](#all-mutable-and-privileged-actions-go-through-sourcegraphs-api-endpoints)
- [Authentication in API endpoints](#authentication-in-api-endpoints)
- [How browsers authenticate with the API endpoints](#how-browsers-authenticate-with-the-api-endpoints)
- [How we protect against CSRF in API endpoints](#how-we-protect-against-csrf-in-api-endpoints)
- [Known issue](#known-issue)
- [Improving our CSRF threat model](#improving-our-csrf-threat-model)
- [Eliminate the username/password manipulation exclusion](#eliminate-the-usernamepassword-manipulation-exclusion)
# Living document
This is a living document, with a changelog as follows:
* Aug 13th, 2021: [@slimsag](https://github.com/slimsag) does an in-depth analysis & review of our CSRF threat model and creates this document.
* Nov 8th, 2021: [@slimsag](https://github.com/slimsag) audited all potential instances of pre-fetched content embedded into pages and found that we have none. The following is therefore NOT true ([#27236](https://github.com/sourcegraph/sourcegraph/pull/27236)):
  * "Some Sourcegraph pages pre-fetch content: on the backend, data is pre-fetched for the user so that they need not make a request for the data corresponding to the page immediately upon loading it. Instead, we fetch it and embed it into the `GET` page response, giving JavaScript access to it immediately upon page load."
* Nov 8th, 2021: [@slimsag](https://github.com/slimsag) adjusted CORS handling to forbid cross-origin requests on all non-API routes ([#27240](https://github.com/sourcegraph/sourcegraph/pull/27240), [#27245](https://github.com/sourcegraph/sourcegraph/pull/27245)):
  * Non-API routes, such as sign in / sign out, no longer allow cross-origin requests even if the origin matches an allowed origin in the `corsOrigin` site configuration setting.
  * The `corsOrigin` site configuration setting now only configures cross-origin requests for _API routes_ (nobody should ever need a cross-origin request for non-API routes).
* Nov 16th, 2021: [@slimsag](https://github.com/slimsag) and the Security team audited `window.context` to identify whether it included any sensitive information that could pose a risk.
  * We found no risk in this data, except for its inclusion of CSRF tokens.
  * JSContext is embedded in the content of HTML pages on GET requests, and it included CSRF tokens. This meant sensitive, unique user data was present in GET responses that were otherwise thought to be static pages. It is likely that we had a caching vulnerability here, in which user A's GET response would be cached (e.g. by an intermediary CDN) and user B would then use user A's cached CSRF token for their subsequent requests. However, since we already relied on browser CORS policies and our CSRF tokens were only a secondary means of security, this was not a real vulnerability. It did, however, illustrate the importance of simplifying our CSRF threat model.
* Nov 16th, 2021: [@slimsag](https://github.com/slimsag) removed our CSRF security tokens/cookies entirely, so that Sourcegraph relies solely on browsers' CORS policies to prevent CSRF attacks ([#7658](https://github.com/sourcegraph/sourcegraph/issues/7658)).
  * In practice, this is just as safe and leads to a simpler CSRF threat model, which reduces the security risks that come with threat model complexity.
  * This fixed the theoretical caching vulnerability with CSRF tokens mentioned in the prior changelog entry. This was not a real vulnerability, but it is another example of why removing our CSRF tokens was the right choice to reduce complexity and ensure our CSRF threat model is solid and well understood.
* Dec 6th, 2021: [@slimsag](https://github.com/slimsag) enabled public usage of our API routes.
  * Previously, only trusted origins (e.g. including those in the site config `corsOrigin` setting) were allowed to issue requests to API routes.
  * Now, any origin is allowed to issue requests to our API routes and, assuming they pass the authentication layer, will reach the GraphQL backend.
  * Any origin is allowed to send credentials and cookies to our API routes, e.g. session cookies and access tokens via basic auth.
  * Only if the request came from a trusted origin will session cookies that came in with the request be respected.
  * Requests from untrusted origins will NEVER have their session cookies respected, i.e. the request will be served as if it came from an unauthenticated user (unless it includes an access token with the request). This is the linchpin which ensures we are still protected against CSRF in our API routes.
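
The origin rule from the Dec 6th entry can be illustrated with a minimal Go sketch. This is not Sourcegraph's actual implementation: the `trustedOrigins` allow-list, the `sessionCookieName` constant, the `identify` and `newRequest` helpers, and the example URLs are all hypothetical names introduced here, and credential validation is left out so that only the origin check is visible.

```go
package main

import (
	"fmt"
	"net/http"
)

// sessionCookieName is a hypothetical cookie name used only in this sketch.
const sessionCookieName = "session"

// trustedOrigins is a hypothetical allow-list standing in for the instance's
// own URL plus any origins from the `corsOrigin` site configuration setting.
var trustedOrigins = map[string]bool{
	"https://sourcegraph.example.com": true,
}

// identity describes how (or whether) a request was authenticated.
type identity string

const (
	anonymous  identity = "anonymous"
	viaSession identity = "session cookie"
	viaToken   identity = "access token"
)

// identify decides which credentials on an API request are respected.
// Validating the credentials themselves is omitted (assumption of the sketch).
func identify(r *http.Request) identity {
	// An access token sent with the request (here via basic auth) is
	// honored regardless of the request's origin.
	if _, _, ok := r.BasicAuth(); ok {
		return viaToken
	}
	// A session cookie is respected only when the request comes from a
	// trusted origin; otherwise it is ignored.
	_, err := r.Cookie(sessionCookieName)
	if err == nil && trustedOrigins[r.Header.Get("Origin")] {
		return viaSession
	}
	return anonymous
}

// newRequest builds an example API request with hypothetical values.
func newRequest(origin string, withCookie, withToken bool) *http.Request {
	r, err := http.NewRequest(http.MethodPost, "https://sourcegraph.example.com/api", nil)
	if err != nil {
		panic(err)
	}
	r.Header.Set("Origin", origin)
	if withCookie {
		r.AddCookie(&http.Cookie{Name: sessionCookieName, Value: "example"})
	}
	if withToken {
		r.SetBasicAuth("example-access-token", "")
	}
	return r
}

func main() {
	// Trusted origin with a session cookie.
	fmt.Println(identify(newRequest("https://sourcegraph.example.com", true, false)))
	// Untrusted origin with the same cookie.
	fmt.Println(identify(newRequest("https://attacker.example", true, false)))
	// Untrusted origin with an access token.
	fmt.Println(identify(newRequest("https://attacker.example", true, true)))
}
```

In this sketch, a forged request from an untrusted origin carries the victim's session cookie but is still served as an unauthenticated request, which is the property the changelog entry relies on.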
# Prerequisites
## Scope
Our CSRF threat model begins and ends at the `sourcegraph-frontend` layer. This is the service through which all HTTP requests reaching Sourcegraph ultimately pass, whether they come from a user's web browser or through our API.
This does not cover additional load balancers, proxies, CDNs, etc. that one may put in front of Sourcegraph:
* Some of our customers choose to place Sourcegraph behind nginx or apache, which may offer additional layers of security.
* For Sourcegraph.com, we place Sourcegraph behind Cloudflare and its WAF for additional security, rate limiting, etc.
* For managed instances, we place Sourcegraph behind GCP's Cloud Load Balancer and Cloud Armor ([details here](https://github.com/sourcegraph/security-issues/issues/158#issuecomment-867038398)).
## What is CSRF, why is it dangerous?
See also: [OWASP: CSRF](https://owasp.org/www-community/attacks/csrf)
CSRF (Cross Site Request Forgery) is when a legitimate user is browsing another site, say attacker.com (run by a malicious actor) or google.com (a legitimate site, perhaps running code written by a malicious actor), and that site makes requests to your own site, say sourcegraph.com. Those requests perform actions on behalf of the user that the user did not intend, using the user's own authentication credentials, often without the user's knowledge.
This can happen in *many* forms:
* GET, POST, etc. HTTP requests made via JavaScript
* GET HTTP requests made by `<img>` tags requesting images
* POST HTTP requests made via HTML `