When fixing an unhandled error from the telemetry dashboard, the issue typically contains an error message, a stack trace, hit count, and affected user count.
Approach
- Do NOT fix at the crash site
The error manifests at a specific line in the stack trace, but the fix almost never belongs there. Fixing at the crash site (e.g., adding a typeof guard in a revive() function, swallowing the error with a try/catch, or returning a fallback value) only masks the real problem. The invalid data still flows through the system and will cause failures elsewhere.
- Trace the data flow upward through the call stack
Read each frame in the stack trace from bottom to top. For each frame, understand:
-
What data is being passed and what is expected
-
Where that data originated (IPC message, extension API call, storage, user input, etc.)
-
Whether the data could have been corrupted or malformed at that point
The goal is to find the producer of invalid data, not the consumer that crashes on it.
- When the producer cannot be identified from the stack alone
Sometimes the stack trace only shows the receiving/consuming side (e.g., an IPC server handler). The sending side is in a different process and not in the stack. In this case:
-
Enrich the error message at the consuming site with diagnostic context: the type of the invalid data, a truncated representation of its value, and which operation/command received it. This information flows into the error telemetry dashboard automatically via the unhandled error pipeline.
-
Do NOT silently swallow the error — let it still throw so it remains visible in telemetry, but with enough context to identify the sender in the next telemetry cycle.
-
Consider adding the same enrichment to the low-level validation function that throws (e.g., include the invalid value in the error message) so the telemetry captures it regardless of call site.
- When the producer IS identifiable
Fix the producer directly:
-
Validate or sanitize data before sending it over IPC / storing it / passing it to APIs
-
Ensure serialization/deserialization preserves types correctly (e.g., URI objects should serialize as UriComponents objects, not as strings)
Example
Given a stack trace like:
at _validateUri (uri.ts) ← validation throws at new Uri (uri.ts) ← constructor at URI.revive (uri.ts) ← revive assumes valid UriComponents at SomeChannel.call (ipc.ts) ← IPC handler receives arg from another process
Wrong fix: Add a typeof guard in URI.revive to return undefined for non-object input. This silences the error but the caller still expects a valid URI and will fail later.
Right fix (when producer is unknown): Enrich the error at the IPC handler level and in _validateUri itself to include the actual invalid value, so telemetry reveals what data is being sent and from where. Example:
// In the IPC handler — validate before revive
function reviveUri(data: UriComponents | URI | undefined | null, context: string): URI {
if (data && typeof data !== 'object') {
throw new Error([Channel] Invalid URI data for '${context}': type=${typeof data}, value=${String(data).substring(0, 100)});
}
// ...
}
// In _validateUri — include the scheme value
throw new Error([UriError]: Scheme contains illegal characters. scheme:"${ret.scheme.substring(0, 50)}" (len:${ret.scheme.length}));
Right fix (when producer is known): Fix the code that sends malformed data. For example, if an authentication provider passes a stringified URI instead of a UriComponents object to a logger creation call, fix that call site to pass the proper object.
Guidelines
-
Prefer enriching error messages over adding try/catch guards
-
Truncate any user-controlled values included in error messages (to avoid PII and keep messages bounded)
-
Do not change the behavior of shared utility functions (like URI.revive ) in ways that affect all callers — fix at the specific call site or producer
-
Run the relevant unit tests after making changes
-
Check for compilation errors via the build task before declaring work complete