Python: Remove imprecise container steps #17493

Draft: yoff wants to merge 9 commits into main from python/no-imprecise-container-steps-cleaned

yoff
Contributor

@yoff yoff commented Sep 17, 2024

Motivation

We used to have taint steps from any element of a collection to the entire collection (see here).
These are very imprecise, leading to false positives (e.g. seen here and here).
They are also at odds with how other languages treat collections, see our issue about this.

We wish to keep the semantics that if a collection is tainted, then all of its elements are considered tainted. Therefore, we now try not to taint a collection when we have precise information about which elements are tainted.
For a list, if an element is tainted, we do not know which one, so any read is potentially reading tainted information.
There is not much difference between the list having content and the list being tainted.
But for a dictionary, if an entry is tainted and we know which one, only reads of that key are reading tainted information. All other reads should ideally be considered safe (previously they were not). If we do not know that other keys are safe, e.g. if the collection came from an untrusted source, we can taint the collection itself, and all reads will be considered dangerous. So for collections with precise content, there is a big difference between having content and the collection being tainted.
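
As a rough illustration of the distinction, consider the Python snippet below. `source` and `sink` are hypothetical stand-ins for an untrusted input and a sensitive operation (they are not part of any library), and the comments describe the intended outcome under the semantics described above, not verified analysis results.

```python
import json


def source():
    """Hypothetical stand-in for untrusted input (an assumption, not a real API)."""
    return '{"name": "attacker-controlled"}'


def sink(value):
    """Hypothetical stand-in for a sensitive operation; returns its argument."""
    return value


# List: we do not track which position holds the tainted element,
# so any read is potentially reading tainted information.
names = ["static", source()]
first = sink(names[0])

# Dictionary with precise content: only the "user" entry is tainted.
record = {"user": source(), "kind": "static"}
user = sink(record["user"])  # reads the tainted entry
kind = sink(record["kind"])  # ideally considered safe

# Collection from an untrusted source: the dictionary itself is tainted,
# so every read is considered dangerous.
untrusted = json.loads(source())
name = sink(untrusted["name"])
```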

Thus we wish to remove these imprecise taint steps for tuples and dictionaries, where we track content precisely (we keep them for lists and sets, where content is imprecise anyway).

Changes

In this PR we do the following:

  • remove tupleStoreStep and dictStoreStep from containerStep. These steps are imprecise, whereas the content of tuples and dictionaries is tracked precisely.
  • add implicit reads to recover taint at sinks
  • add implicit read steps for decoders, to supplement the AdditionalTaintStep, which now only covers the case where the full container is tainted.
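
The interplay between precise content and implicit reads can be sketched with a small toy model in Python. This is only an illustration of the idea under simplifying assumptions: the `DictState` class and its methods are hypothetical and do not correspond to how the CodeQL libraries represent flow state.

```python
from dataclasses import dataclass, field


@dataclass
class DictState:
    """Toy abstract state for a dictionary (hypothetical, not CodeQL's model)."""

    whole_tainted: bool = False  # the collection itself is tainted
    tainted_keys: set = field(default_factory=set)  # precisely tracked content

    def read(self, key):
        """A read is tainted if the whole dict is, or if that key holds taint."""
        return self.whole_tainted or key in self.tainted_keys

    def at_sink(self):
        """Implicit read: at a sink, tainted content under any key counts."""
        return self.whole_tainted or bool(self.tainted_keys)


# Precise content: only the "user" entry is tainted.
precise = DictState(tainted_keys={"user"})

# Whole collection tainted, e.g. decoded from untrusted input.
untrusted = DictState(whole_tainted=True)
```

In this model, `precise.read("kind")` is not tainted while `precise.at_sink()` is, which mirrors the intent of recovering taint at sinks without tainting every read.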

Status:

Potential confusions:

  • A comprehension is no longer tainted even if it has tainted elements. See the taint test for Tornado for an example.
  • Dict.items is no longer tainted for a tainted dict (but Dict.values is). We could choose to change this.
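
The two points above might look as follows in analysed code. This is a sketch only: `source` is a hypothetical stand-in for untrusted input, and the comments restate the behaviour described in this list, not verified query output.

```python
import json


def source():
    """Hypothetical stand-in for untrusted input (an assumption, not a real API)."""
    return '{"name": "attacker-controlled"}'


# Comprehension over elements that include a tainted one: the elements
# carry taint, but the resulting list as a whole is no longer tainted.
elements = [source(), "static"]
upper = [e.upper() for e in elements]

# Fully tainted dictionary, e.g. decoded from untrusted input.
data = json.loads(source())
values = list(data.values())  # still considered tainted
items = list(data.items())    # no longer considered tainted in this PR
```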

Improvements:

  • Fixed FP in test_unpacking
  • Fixed FP in CleartextLogging
  • Nicer paths in NoSqlInjection test

@yoff yoff force-pushed the python/no-imprecise-container-steps-cleaned branch from 9e17962 to 060d0b4 Compare September 17, 2024 20:14
@yoff yoff force-pushed the python/no-imprecise-container-steps-cleaned branch from 31faf91 to a74474e Compare November 1, 2024 13:54
yoff added 8 commits November 13, 2024 10:32
- remove `tupleStoreStep` and `dictStoreStep` from `containerStep`
   These are imprecise compared to the content being precise.
- add implicit reads to recover taint at sinks
- add implicit read steps for decoders
  to supplement the `AdditionalTaintStep`
  that now only covers when the full container is tainted.
We now find an alert on this line as we hope to
It is not an alert for _full_ SSRF, though, since that configuration cannot handle multiple substitutions.
and adjust collection test
@yoff yoff force-pushed the python/no-imprecise-container-steps-cleaned branch from 61c551f to 4c19a43 Compare November 13, 2024 09:33
1 participant