fix(cli): add Copilot variant to CliProviderType enum #3
Open
Conversation
…muggling (CWE-444) (NVIDIA#663)

* fix(l7): reject duplicate Content-Length headers to prevent request smuggling

  Both parse_body_length() in rest.rs and try_parse_http_request() in inference.rs silently accepted multiple Content-Length headers, overwriting with the last value seen. Per RFC 7230 Section 3.3.3, a message with multiple Content-Length headers with differing values must be rejected to prevent HTTP request smuggling (CWE-444). An attacker could send conflicting Content-Length values, causing the proxy and downstream server to disagree on message boundaries.

  Fix:
  - rest.rs: detect duplicate CL headers with differing values and return an error before forwarding
  - inference.rs: add ParseResult::Invalid variant; detect duplicate CL headers and return Invalid with a descriptive reason
  - proxy.rs: handle ParseResult::Invalid by sending HTTP 400 and denying the connection

  Closes NVIDIA#637

* fix(l7): address review feedback on Content-Length smuggling defense

  - inference.rs: reject unparseable Content-Length values instead of silently defaulting to 0 via unwrap_or(0)
  - rest.rs: reject unparseable Content-Length values so a valid+invalid duplicate pair cannot bypass the differing-values check
  - rest.rs: fix Transfer-Encoding substring match (.contains("chunked") → split/trim exact match) to align with inference.rs and prevent false positives on values like "chunkedx"
  - proxy.rs: log parsing details server-side via tracing::warn and return a generic "Bad Request" body instead of leaking internal parsing reasons to sandboxed code
  - Add tests for all new rejection paths in inference.rs and rest.rs

* style(l7): apply cargo fmt formatting

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
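The duplicate Content-Length rejection this commit describes can be sketched as follows. This is a minimal illustration, not the actual rest.rs/inference.rs code: `parse_content_length` and the header representation are hypothetical names chosen for the example.

```rust
/// Returns Ok(Some(len)) for a single consistent Content-Length,
/// Ok(None) when the header is absent, and Err for smuggling-suspect input.
fn parse_content_length(headers: &[(String, String)]) -> Result<Option<u64>, String> {
    let mut seen: Option<u64> = None;
    for (name, value) in headers {
        if name.eq_ignore_ascii_case("content-length") {
            // Reject unparseable values instead of silently defaulting to 0.
            let len: u64 = value
                .trim()
                .parse()
                .map_err(|_| format!("invalid Content-Length: {value:?}"))?;
            match seen {
                // RFC 7230 §3.3.3: differing duplicate values must be rejected.
                Some(prev) if prev != len => {
                    return Err("conflicting Content-Length headers".into());
                }
                _ => seen = Some(len),
            }
        }
    }
    Ok(seen)
}

fn h(v: &[(&str, &str)]) -> Vec<(String, String)> {
    v.iter().map(|(a, b)| (a.to_string(), b.to_string())).collect()
}

fn main() {
    assert_eq!(parse_content_length(&h(&[("Content-Length", "5")])), Ok(Some(5)));
    // Duplicates with the same value are tolerated; differing values are not.
    assert_eq!(
        parse_content_length(&h(&[("Content-Length", "5"), ("content-length", "5")])),
        Ok(Some(5))
    );
    assert!(parse_content_length(&h(&[("Content-Length", "5"), ("content-length", "9")])).is_err());
    assert!(parse_content_length(&h(&[("Content-Length", "abc")])).is_err());
    println!("ok");
}
```

Note that rejecting unparseable values matters here: if `"abc"` fell back to 0, a valid+invalid duplicate pair would slip past the differing-values check, which is exactly the bypass the review feedback closed.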
* refactor(l7): export evaluate_l7_request for cross-module use

  Make evaluate_l7_request() public so the forward proxy path can evaluate individual requests against L7 policy without going through the full relay_with_inspection() loop.

* fix(proxy): add L7 inspection to forward proxy path

  The forward proxy previously rejected all requests to endpoints with L7 rules (blanket 403), forcing clients through the CONNECT tunnel. This meant policies like read-only (allow GET, block POST) had no effect on plain http:// requests through the forward proxy.

  Replace the blanket rejection with actual L7 evaluation:
  - Query L7 config for the endpoint (same as before)
  - Clone the OPA engine and evaluate the request method/path
  - Allow if L7 policy permits, deny with 403 if enforcement is enforce
  - Audit mode: log but allow (matching CONNECT path behavior)
  - Fail-closed: deny on evaluation errors

  The forward proxy uses Connection: close (one request per connection), so a single evaluation suffices — no relay loop needed.

  Update e2e tests to validate the new behavior:
  - GET /allowed → 200 (L7 policy allows)
  - POST /allowed → 403 (L7 policy denies, enforcement: enforce)

  Update security-policy.md to reflect the new forward proxy L7 behavior.

  Closes NVIDIA#643

* style(proxy): apply cargo fmt formatting

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Fork PRs receive a read-only GITHUB_TOKEN that cannot push to gh-pages, causing the deploy step to fail with a 403. Skip the deploy for fork PRs while still running the docs build to validate that changes compile cleanly.
…VIDIA#686) The base sandbox image installs Python via uv at /sandbox/.uv/python/*/bin/python*, but the proxy resolves binary identity via /proc/PID/exe (the real path, not the symlink). The test policy only listed /usr/bin/python* and /usr/local/bin/python*, so OPA denied the connection at L4 before L7 evaluation could run. Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
…ser (CWE-444) (NVIDIA#671)

The CL/TE desynchronisation guard added in NVIDIA#663 for the REST path was not applied to the inference request parser. A request containing both Content-Length and Transfer-Encoding headers could be interpreted differently by the proxy and the upstream server, enabling HTTP request smuggling (CWE-444, RFC 7230 Section 3.3.3).

Add the same rejection check and tests mirroring the REST parser coverage, including TE substring validation.

Signed-off-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
Co-authored-by: latenighthackathon <latenighthackathon@users.noreply.github.com>
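The CL/TE conflict check described above can be sketched like this. It is an illustration over a simplified header list, not the actual parser (which works on raw request bytes); note the exact-token Transfer-Encoding match rather than a substring check.

```rust
/// A request carrying both Content-Length and a chunked Transfer-Encoding
/// must be rejected (RFC 7230 §3.3.3) to prevent CL/TE desynchronisation.
fn has_cl_te_conflict(headers: &[(String, String)]) -> bool {
    let has_cl = headers
        .iter()
        .any(|(k, _)| k.eq_ignore_ascii_case("content-length"));
    // Exact token match after split/trim, so "chunkedx" is NOT treated as chunked.
    let chunked = headers.iter().any(|(k, v)| {
        k.eq_ignore_ascii_case("transfer-encoding")
            && v.split(',').any(|t| t.trim().eq_ignore_ascii_case("chunked"))
    });
    has_cl && chunked
}

fn main() {
    let h = |k: &str, v: &str| (k.to_string(), v.to_string());
    // Both headers present: reject.
    assert!(has_cl_te_conflict(&[
        h("Content-Length", "5"),
        h("Transfer-Encoding", "chunked"),
    ]));
    // "chunkedx" must not count as chunked (the false positive the REST fix closed).
    assert!(!has_cl_te_conflict(&[
        h("Content-Length", "5"),
        h("Transfer-Encoding", "chunkedx"),
    ]));
    // Transfer-Encoding alone is fine.
    assert!(!has_cl_te_conflict(&[h("Transfer-Encoding", "chunked")]));
    println!("ok");
}
```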
…ntire ruleset (NVIDIA#677)

* fix(sandbox): handle per-path Landlock errors instead of abandoning entire ruleset

  A single missing path (e.g., /app in containers without that directory) caused PathFd::new() to propagate an error out of the entire Landlock setup closure. Under BestEffort mode, this silently disabled all filesystem restrictions for the sandbox.

  Changes:
  - Extract try_open_path() and classify_path_error() helpers that handle PathFd failures per-path instead of per-ruleset
  - BestEffort mode: skip inaccessible paths with a warning, apply remaining rules
  - HardRequirement mode: fail immediately on any inaccessible path
  - Add zero-rule safety check to prevent applying an empty ruleset that would block all filesystem access
  - Pre-filter system-injected baseline paths (e.g., /app) in enrichment functions so missing paths never reach Landlock
  - Add unit tests for try_open_path, classify_path_error, and error classification for ENOENT, EACCES, ELOOP, ENAMETOOLONG, ENOTDIR
  - Update user-facing docs and architecture docs with Landlock behavior tables, baseline path filtering, and compatibility mode semantics
  - Fix stale ABI::V1 references in docs (code uses ABI::V2)

  Closes NVIDIA#664

* fix(sandbox): use debug log for NotFound in Landlock best-effort mode

  NotFound errors for stale baseline paths (e.g. /app persisted in the server-stored policy but absent in this container) are expected in best-effort mode. Downgrade from warn! to debug! so the message does not leak into SSH exec stdout (the pre_exec hook inherits the tracing subscriber whose writer targets fd 1). Genuine errors (permission denied, symlink loops, etc.) remain at warn! for operator visibility.

  Also move the custom_image e2e marker from /opt to /etc (a Landlock baseline read-only path) since the security fix now properly enforces filesystem restrictions.

Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
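The per-path classification described above can be sketched as a decision over the error kind and the compatibility mode. This is a hypothetical classifier working on `std::io::ErrorKind`; the real `classify_path_error()` helper operates on Landlock/PathFd errors and drives `debug!`/`warn!` logging rather than returning an enum.

```rust
use std::io::ErrorKind;

#[derive(Debug, PartialEq)]
enum PathAction {
    SkipQuietly,     // expected in best-effort mode (stale baseline path): debug!
    SkipWithWarning, // genuine problem, stays operator-visible: warn!
    Fail,            // hard-requirement mode: abort sandbox setup
}

fn classify_path_error(kind: ErrorKind, best_effort: bool) -> PathAction {
    if !best_effort {
        // HardRequirement: any inaccessible path fails immediately.
        return PathAction::Fail;
    }
    match kind {
        // e.g. /app persisted in the stored policy but absent in this container.
        ErrorKind::NotFound => PathAction::SkipQuietly,
        // Permission denied, symlink loops, etc. remain visible.
        _ => PathAction::SkipWithWarning,
    }
}

fn main() {
    assert_eq!(classify_path_error(ErrorKind::NotFound, true), PathAction::SkipQuietly);
    assert_eq!(
        classify_path_error(ErrorKind::PermissionDenied, true),
        PathAction::SkipWithWarning
    );
    assert_eq!(classify_path_error(ErrorKind::NotFound, false), PathAction::Fail);
    println!("ok");
}
```

The key property is that a skip affects only one rule: the remaining paths still get added to the ruleset, and a separate zero-rule check prevents applying an empty ruleset that would block all filesystem access.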
Missed the "--from openclaw" input parameter. Without this input, the sandbox container didn't install "openclaw":

    Stop with: openshell forward stop 18789 eager-dragonfly
    /bin/bash: line 1: openclaw-start: command not found
* feat(sandbox): add L7 query parameter matchers

* fix(sandbox): decode + as space in query params and validate glob syntax

  Three improvements from PR NVIDIA#617 review:

  1. Decode + as space in query string values per the application/x-www-form-urlencoded convention. This matches Python's urllib.parse, JavaScript's URLSearchParams, Go's url.ParseQuery, and most HTTP frameworks. Literal + should be sent as %2B.
  2. Add glob pattern syntax validation (warnings) for query matchers. Checks for unclosed brackets and braces in glob/any patterns. These are warnings (not errors) because OPA's glob.match is forgiving, but they surface likely typos during policy loading.
  3. Add missing test cases: empty query values, keys without values, unicode after percent-decoding, empty query strings, and literal + via %2B encoding.

* fix(sandbox): add missing query_params field in forward proxy L7 request info

* style(sandbox): fix formatting in proxy L7 query param parsing

Signed-off-by: John Myers <9696606+johntmyers@users.noreply.github.com>
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
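Point 1 above can be sketched as a small decoder: `+` becomes a space, `%XX` is percent-decoded, and UTF-8 is validated after decoding. The function name is illustrative, not the actual proxy code.

```rust
/// Decode a query-string value per the application/x-www-form-urlencoded
/// convention. Returns None on malformed percent escapes or invalid UTF-8.
fn decode_query_value(s: &str) -> Option<String> {
    let bytes = s.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            // '+' means space; a literal '+' must be sent as %2B.
            b'+' => {
                out.push(b' ');
                i += 1;
            }
            b'%' => {
                let hex = bytes.get(i + 1..i + 3)?;
                let byte = u8::from_str_radix(std::str::from_utf8(hex).ok()?, 16).ok()?;
                out.push(byte);
                i += 3;
            }
            b => {
                out.push(b);
                i += 1;
            }
        }
    }
    // Unicode is validated after percent-decoding, not before.
    String::from_utf8(out).ok()
}

fn main() {
    assert_eq!(decode_query_value("a+b").as_deref(), Some("a b"));
    assert_eq!(decode_query_value("1%2B1").as_deref(), Some("1+1")); // literal '+'
    assert_eq!(decode_query_value("caf%C3%A9").as_deref(), Some("café")); // unicode
    assert_eq!(decode_query_value(""), Some(String::new())); // empty value
    assert_eq!(decode_query_value("%G1"), None); // bad hex digit
    println!("ok");
}
```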
…ution (NVIDIA#555)

* perf(sandbox): streaming SHA256, spawn_blocking for identity resolution

  Key changes:
  - Replace full file read + SHA256 with streaming 64KB-buffered hash (saves 124MB allocation for node binary)
  - Wrap evaluate_opa_tcp in spawn_blocking to prevent blocking tokio runtime during heavy /proc I/O and SHA256 computation
  - Add file-based perf logging for profiling proxy latency phases

  Profiling data (node binary, 124MB):
  - Cold TOFU: ~890ms (read+hash), warm: 0ms (cache hit)
  - evaluate_opa_tcp: cold=1002ms, warm=11ms
  - OPA evaluation: 1ms
  - DNS+TCP connect: 166-437ms

  Made-with: Cursor

* refactor(sandbox): replace perf_log with tracing::debug

  Replace the custom file-based perf_log() helper with standard tracing::debug!() macros as requested in PR review. This removes the custom log file writes to /var/log/openshell-perf.log and routes all performance timing through the tracing framework at DEBUG level, consistent with the rest of the codebase.

  Made-with: Cursor

* refactor(sandbox): reduce tracing to 6 key diagnostic logs

  Address PR review feedback:

  1. Remove ~20 inner-phase timing logs, keeping only the 6 that tell the full diagnostic story:
     - evaluate_opa_tcp TOTAL (proxy.rs)
     - dns_resolve_and_tcp_connect (proxy.rs)
     - file_sha256 (procfs.rs)
     - verify_or_cache CACHE HIT / CACHE MISS / TOTAL cold (identity.rs)
  2. Restore intent-describing comments that were replaced by timing logs:
     - "TOFU verify the immediate binary" (proxy.rs)
     - "Walk the process tree upward to collect ancestor binaries" (proxy.rs)
     - "Collect cmdline paths for script-based binary detection." (proxy.rs)
     - "First: scan descendants of the entrypoint process" (procfs.rs)
     - "Fallback: scan all of /proc in case the process isn't in the tree" (procfs.rs)
     - "Skip PIDs we already checked" (procfs.rs)
  3. Preserve file path in file_sha256 read errors instead of losing context via into_diagnostic().
  4. Tests: 293 passed, 1 pre-existing failure (drop_privileges), 1 ignored.

  Made-with: Cursor

* style(sandbox): apply rustfmt formatting to debug macros

Signed-off-by: Rafael Koike <koike.rafael@gmail.com>
Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
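The streaming pattern from the first commit can be sketched as a 64 KiB buffered loop. `DefaultHasher` stands in for SHA-256 here so the example is self-contained (the real code uses a streaming SHA-256 digest); the buffered-read loop, which avoids the 124MB whole-file allocation, is the point.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{Cursor, Read};

/// Hash a reader in 64 KiB chunks instead of reading the whole file
/// into one buffer first.
fn streaming_hash<R: Read>(mut reader: R) -> std::io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    let mut buf = vec![0u8; 64 * 1024];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // EOF
        }
        hasher.write(&buf[..n]);
    }
    Ok(hasher.finish())
}

fn main() -> std::io::Result<()> {
    // 200 KiB of data forces several iterations through the chunk loop.
    let data: Vec<u8> = (0..200 * 1024).map(|i| (i % 251) as u8).collect();
    let chunked = streaming_hash(Cursor::new(&data))?;

    // The digest must not depend on chunk boundaries: feeding the same
    // bytes in one shot yields the same result.
    let mut one_shot = DefaultHasher::new();
    one_shot.write(&data);
    assert_eq!(chunked, one_shot.finish());
    println!("ok");
    Ok(())
}
```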
* feat(inference): add timeout
* feat(inference): fix dynamic timeout change
* feat(inference): update docs
* feat(inference): fix formatting
…VIDIA#687)

Replace flat pty_master/input_sender/pty_request fields in SshHandler with a HashMap<ChannelId, ChannelState> so each channel tracks its own PTY resources independently. This fixes window_change_request resizing the wrong PTY when multiple channels are open simultaneously.

Also fixes ioctl UB in set_winsize (pass &winsize, not winsize by value) and adds warn! logging for unknown channels across all handlers.

Resolves NVIDIA#543
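The per-channel state change can be sketched as follows. The types are simplified stand-ins (the SSH library's ChannelId is an opaque type, and the real ChannelState also owns the PTY master fd and the input sender); only the map-keyed lookup is the point.

```rust
use std::collections::HashMap;

type ChannelId = u32; // stand-in for the SSH library's opaque ChannelId

#[derive(Default)]
struct ChannelState {
    // The real struct also holds the PTY master fd and the input sender.
    cols: u16,
    rows: u16,
}

struct SshHandler {
    channels: HashMap<ChannelId, ChannelState>,
}

impl SshHandler {
    /// window_change now touches only the channel the request arrived on.
    fn window_change(&mut self, id: ChannelId, cols: u16, rows: u16) -> bool {
        match self.channels.get_mut(&id) {
            Some(state) => {
                state.cols = cols;
                state.rows = rows;
                true
            }
            None => {
                // Unknown channels are logged instead of silently ignored.
                eprintln!("window_change_request for unknown channel {id}");
                false
            }
        }
    }
}

fn main() {
    let mut h = SshHandler { channels: HashMap::new() };
    h.channels.insert(1, ChannelState::default());
    h.channels.insert(2, ChannelState { cols: 80, rows: 24 });

    assert!(h.window_change(1, 120, 40));
    // Channel 2 keeps its own size: resizing channel 1 no longer leaks across.
    assert_eq!((h.channels[&2].cols, h.channels[&2].rows), (80, 24));
    assert!(!h.window_change(9, 1, 1)); // unknown channel: logged, not resized
    println!("ok");
}
```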
…DIA#495)

* feat(bootstrap): switch GPU injection to CDI where supported

  Use an explicit CDI device request (driver="cdi", device_ids=["nvidia.com/gpu=all"]) when the Docker daemon reports CDI spec directories via GET /info (SystemInfo.CDISpecDirs). This makes device injection declarative and decouples spec generation from consumption.

  When the daemon reports no CDI spec directories, fall back to the legacy NVIDIA device request (driver="nvidia", count=-1), which relies on the NVIDIA Container Runtime hook. Failure modes for both paths are equivalent: a missing or stale NVIDIA Container Toolkit installation will cause container start to fail.

  CDI spec generation is out of scope for this change; specs are expected to be pre-generated out-of-band, for example by the NVIDIA Container Toolkit.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
Co-authored-by: Piotr Mlocek <pmlocek@nvidia.com>
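The selection logic can be sketched as a branch on whether the daemon advertises CDI spec directories. The struct and function names here are illustrative (the real code builds the Docker API DeviceRequest via its bootstrap client), but the field values mirror the commit message.

```rust
/// Simplified stand-in for the Docker API DeviceRequest body.
#[derive(Debug, PartialEq)]
struct DeviceRequest {
    driver: &'static str,
    count: i64,
    device_ids: Vec<String>,
}

fn gpu_device_request(cdi_spec_dirs: &[String]) -> DeviceRequest {
    if !cdi_spec_dirs.is_empty() {
        // Daemon reports CDI support (SystemInfo.CDISpecDirs non-empty):
        // declarative request resolved against pre-generated CDI specs.
        DeviceRequest {
            driver: "cdi",
            count: 0,
            device_ids: vec!["nvidia.com/gpu=all".into()],
        }
    } else {
        // Legacy path via the NVIDIA Container Runtime hook.
        DeviceRequest { driver: "nvidia", count: -1, device_ids: vec![] }
    }
}

fn main() {
    let cdi = gpu_device_request(&["/etc/cdi".to_string()]);
    assert_eq!(cdi.driver, "cdi");
    assert_eq!(cdi.device_ids, vec!["nvidia.com/gpu=all".to_string()]);

    let legacy = gpu_device_request(&[]);
    assert_eq!((legacy.driver, legacy.count), ("nvidia", -1));
    println!("ok");
}
```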
* feat(sandbox): switch device plugin to CDI injection mode Configure the NVIDIA device plugin to use deviceListStrategy=cdi-cri so that GPU devices are injected via direct CDI device requests in the CRI. Sandbox pods now only require the nvidia.com/gpu resource request — runtimeClassName is no longer set on GPU pods. Signed-off-by: Evan Lezar <elezar@nvidia.com>
The provider backend (registry, discovery, policy) already supported copilot, but the CLI CliProviderType enum was missing the Copilot variant. This caused 'provider create --type copilot' to be rejected and 'sandbox create -- copilot' to fail at the CLI layer.
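The fix itself is small and can be sketched as follows. The real `CliProviderType` lives in `crates/openshell-cli/src/main.rs` and has more variants; `Claude` here is just an illustrative pre-existing one.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum CliProviderType {
    Claude,  // illustrative existing variant
    Copilot, // the newly added variant
}

impl CliProviderType {
    fn as_str(self) -> &'static str {
        match self {
            Self::Claude => "claude",
            Self::Copilot => "copilot", // the new match arm
        }
    }
}

fn main() {
    // 'provider create --type copilot' now parses at the CLI layer.
    assert_eq!(CliProviderType::Copilot.as_str(), "copilot");
    assert_eq!(CliProviderType::Claude.as_str(), "claude");
    println!("ok");
}
```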
…sic auth encoding

Add two credential injection capabilities to the L7 proxy's SecretResolver:

1. Query parameter rewriting: resolve placeholder tokens in URL query parameter values (e.g. ?api_key=openshell:resolve:env:KEY) with proper percent-encoding of the resolved secret.
2. Basic Authorization header encoding: decode base64 Basic auth tokens, resolve placeholder tokens in the decoded username:password string, and re-encode before forwarding upstream.

Both features operate within the existing rewrite_http_header_block flow and require no changes to the network policy file spec or proto schema.

Closes NVIDIA#630
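Point 1 above hinges on percent-encoding the resolved secret before substitution, since secrets may contain query-string metacharacters. A minimal sketch, with hypothetical function names (the real logic lives inside SecretResolver):

```rust
/// Percent-encode a string, leaving only RFC 3986 unreserved characters as-is.
fn percent_encode(s: &str) -> String {
    s.bytes()
        .map(|b| match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                (b as char).to_string()
            }
            _ => format!("%{b:02X}"),
        })
        .collect()
}

/// Replace a placeholder query value with the percent-encoded secret;
/// non-placeholder values pass through unchanged.
fn rewrite_query_value(value: &str, placeholder: &str, secret: &str) -> String {
    if value == placeholder {
        percent_encode(secret)
    } else {
        value.to_string()
    }
}

fn main() {
    // '@', ' ', and '+' would all corrupt a query string if left raw.
    assert_eq!(percent_encode("p@ss word+"), "p%40ss%20word%2B");
    assert_eq!(
        rewrite_query_value("openshell:resolve:env:KEY", "openshell:resolve:env:KEY", "s3cr&t"),
        "s3cr%26t"
    );
    assert_eq!(rewrite_query_value("plain", "openshell:resolve:env:KEY", "x"), "plain");
    println!("ok");
}
```

The Basic auth path (point 2) applies the same resolution to the decoded username:password pair, but re-encodes with base64 rather than percent-encoding, since the value travels in an Authorization header rather than a URL.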
Summary
Adds the missing `Copilot` variant to the `CliProviderType` enum and its `as_str()` match arm, completing Copilot provider support in the CLI layer.

Related Issue
Follow-up to the Copilot provider addition (commit eff88b7). The provider backend (registry, discovery, policy) was fully wired, but the CLI enum was missing `Copilot`, causing:

- `provider create --type copilot` to be rejected by the CLI argument parser
- `sandbox create -- copilot` to fail with "missing provider type copilot"

Changes
- Add `Copilot` variant to `CliProviderType` enum in `crates/openshell-cli/src/main.rs`
- Add `Self::Copilot => "copilot"` match arm in `CliProviderType::as_str()`

Testing
- `cargo build --workspace` — passes
- `cargo test -p openshell-providers` — all 13 tests pass
- `sandbox create -- copilot` successfully infers the copilot provider type and proceeds to sandbox creation

Checklist