sourcegraph/internal/jsonc/jsonc.go
Quinn Slack 2991e65405
remove easy-to-misuse jsonc.Normalize (#42037)
* standardize jsonc parsing, unmarshaling, and normalization on empty documents

Empty JSONC documents that had whitespace or comments were treated differently by

- jsonc.Normalize, which converted them to `{}` (an object)
- jsonc.Parse, which returned "" (which is invalid JSON)
- jsonc.Unmarshal, which silently returned without doing anything

Now, they all treat these empty JSONC documents the same (as though the JSONC document was `null`, i.e. JSON null, which is a valid JSON document).

Fixes https://github.com/sourcegraph/sourcegraph/issues/42000 because the validation (which used the jsonc.Normalize code path) now reports an error due to `null` not conforming to the JSON Schema (which expects the root to be an object).

* show nicer error message when attempting to save empty settings

User settings (and all other JSON configuration input) must be valid JSONC objects. An empty JSONC document, or one that consists solely of whitespace or comments, is converted to JSON `null`.

The previous commit fixed https://github.com/sourcegraph/sourcegraph/issues/42000. This commit presents a nicer error for that case: instead of `invalid settings: (root): Invalid type. Expected: object, given: null`, the user now sees `invalid settings: must be a JSON object (use {} for empty)`.

* remove confusing error return from JSON Schema validation functions

It would be easy for callers to assume a nil error means no problems. That assumption would be false, and it could lead to callers unintentionally accepting invalid JSON.

This removal of the error return makes the API impossible to use incorrectly.

* remove easy-to-misuse jsonc.Normalize

Our settings, site config, and external service configs are stored in the "JSONC" format, which is like JSON but with comments and trailing commas allowed. All code paths that actually read and write this format to/from the DB enforce "JSONC".

However, when validating settings and site config (but not external service config), we used an even more lenient format, which you might call "JSONC that skips past syntax errors". It allowed things like `{"a": 1, b}` (note the unquoted object key with no value).

The jsonc package confusingly supported both formats:

- jsonc.Parse and jsonc.Unmarshal supported "JSONC".
- jsonc.Normalize supported "JSONC that skips past syntax errors".

Why did we ever support "JSONC that skips past syntax errors"? I believe I implicitly made the decision to "support" it by not building a better, clearer API. (Also, VS Code supports it in its own JSON settings files. But it has a better reason for supporting it: you can edit those files to introduce syntax errors on your own disk using other editors. We, on the other hand, can prevent edits that are syntactically invalid.)

This commit remedies that past bad decision and removes jsonc.Normalize. We no longer support "JSONC that skips past syntax errors" anywhere. Because this lenient format was only used when producing validation notices, and not when actually reading/writing configuration in ways that have side effects, this should not cause problems.

The sole expected behavior difference is that if Sourcegraph is attempted to be run with syntactically invalid configuration (e.g., via `GLOBAL_SETTINGS_FILE`), then the error message printed will just report the syntax error and not other errors that can be ascertained if the syntax error is ignored. This behavior is actually preferable because the syntax error should be fixed first anyway.

Co-authored-by: Milan Freml <kopancek@users.noreply.github.com>
2022-09-26 11:21:21 +02:00

83 lines
2.4 KiB
Go

package jsonc
import (
"encoding/json"
"github.com/sourcegraph/jsonx"
"github.com/sourcegraph/sourcegraph/lib/errors"
)
// Unmarshal unmarshals the JSON using a fault-tolerant parser that allows comments and trailing
// commas. If any unrecoverable faults are found, an error is returned.
func Unmarshal(text string, v any) error {
data, err := Parse(text)
if err != nil {
return err
}
return json.Unmarshal(data, v)
}
// Parse converts JSON with comments, trailing commas, and some types of syntax errors into standard
// JSON. If there is an error that it can't unambiguously resolve, it returns the error. If the
// error is non-nil, it always returns a valid JSON document.
func Parse(text string) ([]byte, error) {
data, errs := jsonx.Parse(text, jsonx.ParseOptions{Comments: true, TrailingCommas: true})
if len(errs) > 0 {
return data, errors.Errorf("failed to parse JSON: %v", errs)
}
if data == nil {
return []byte("null"), nil
}
return data, nil
}
var DefaultFormatOptions = jsonx.FormatOptions{InsertSpaces: true, TabSize: 2}
// Remove returns the input JSON with the given path removed.
func Remove(input string, path ...string) (string, error) {
edits, _, err := jsonx.ComputePropertyRemoval(input,
jsonx.PropertyPath(path...),
DefaultFormatOptions,
)
if err != nil {
return input, err
}
return jsonx.ApplyEdits(input, edits...)
}
// Edit returns the input JSON with the given path set to v.
func Edit(input string, v any, path ...string) (string, error) {
edits, _, err := jsonx.ComputePropertyEdit(input,
jsonx.PropertyPath(path...),
v,
nil,
DefaultFormatOptions,
)
if err != nil {
return input, err
}
return jsonx.ApplyEdits(input, edits...)
}
// ReadProperty attempts to read the value of the specified path, ignoring parse errors. it will only error if the path
// doesn't exist
func ReadProperty(input string, path ...string) (any, error) {
root, _ := jsonx.ParseTree(input, jsonx.ParseOptions{Comments: true, TrailingCommas: true})
node := jsonx.FindNodeAtLocation(root, jsonx.PropertyPath(path...))
if node == nil {
return nil, errors.Errorf("couldn't find node: %s", path)
}
return node.Value, nil
}
// Format returns the input JSON formatted with the given options.
func Format(input string, opt *jsonx.FormatOptions) (string, error) {
if opt == nil {
opt = &DefaultFormatOptions
}
return jsonx.ApplyEdits(input, jsonx.Format(input, *opt)...)
}