Skip to content

Fix interaction between defaultAdditionalTaintStep and defaultImplicitTaintRead #18776

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Feb 13, 2025

defaultImplicitTaintRead is described as follows:

  /**
   * Holds if taint flow configurations should allow implicit reads of `c` at sinks
   * and inputs to additional taint steps.
   */

However where we implement the functionality of defaultImplicitTaintRead (in allowImplicitRead, see the first commit), it is only applied at sinks and additional taint steps defined in the configuration. Additional taint steps defined via defaultAdditionalTaintStep are overlooked.

In Rust, one such defaultAdditionalTaintStep is for concatenation with +, and one such defaultImplicitTaintRead is for reading from reference content. As a result we've been missing flow in common cases like this:

let _ = string1 + &string2;

This PR is a draft (for now) because I'm nervous of unintended consequences, or whether this interaction was left out on purpose (for performance reasons perhaps). The change is in shared code and may affect all languages.

TODO:

  • discuss
  • review / accept test changes
    • Rust changes are positive
    • Swift changes are positive
    • JS changes to library-tests/TaintedUrlSuffix are positive, the rest I don't understand (neutral???)
    • Go TODO
    • Ruby TODO
  • DCA runs
  • change note

@geoffw0 geoffw0 added Rust Pull requests that update Rust code DataFlow Library Ruby Go javascript Pull requests that update Javascript code Swift labels Feb 13, 2025
@@ -71,7 +71,8 @@ module TaintFlowMake<
Config::isSink(node) or
Config::isSink(node, _) or
Config::isAdditionalFlowStep(node, _, _) or
Config::isAdditionalFlowStep(node, _, _, _, _)
Config::isAdditionalFlowStep(node, _, _, _, _) or
defaultAdditionalTaintStep(node, _, _)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ for the benefit of reviewers, this is the actual change, everything else is consequences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aschackmull does this looks like a reasonable change to you?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was left out on purpose to avoid lazy/imprecise models. Models from e.g. MaD are expected to handle content precisely without the need for implicit reads. I'm afraid that adding this could yield unintended FP flow in all sorts of places, but I don't know for sure. It's certainly risky, and I wouldn't expect it to be necessary.

Could you elaborate on the motivating example?

let _ = string1 + &string2;

Where's the missing read step - is it from string2 to &string2 or something like that? (please excuse my ignorance of Rust semantics). If so, it would seem that there would be plenty of room to add a proper read step.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we have flow from string2 to &string2 and it adds ReferenceContent. The problem is that our model for + (in defaultAdditionalTaintStep) expects there to be no content. The compiler I believe adds an implicit dereference but there's nothing in our AST representing that.

Thus, I think we want to be able to read out of a ReferenceContent just about anywhere. or we need to figure out where the implicit dereferences occur – but I suspect we don't have enough information about types in the AST for that yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to figure out where the implicit dereferences occur – but I suspect we don't have enough information about types in the AST for that yet.

If the key word in that sentence is "yet", then this sounds like the right approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hvitved do we have plans around implicit dereferences (I think that's what's going on here)? Can I write up an issue if we don't already have one?

(we can see from the tests that it's not just string arguments to + that get implicitly dereferenced, though that's a common situation)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that what we currently do for + is not precise. The + operator is sugar for invoking add on the Add trait. So what happens for + depends on the specific implementation of that trait. For String the add implementation takes a &str as its parameter. So it's not an implicit dereference, but that the function actually taking a reference as it's argument.

Once we can handle calls to trait methods, we should represent + as a trait call, and then we should be able to write a model for String::add.

@asgerf
Copy link
Contributor

asgerf commented Feb 27, 2025

I agree the comment is misleading and should be clarified. I also noticed this behaviour through trial and error and was told in person that it was intentional. But we should absolutely put the rationale for this into the comment to avoid any more engineers having to "discover" this through trial and error.

As for the change itself I don't think it will be as easy as just adding that one line, as there are taint steps for which we do not want implicit reads. The JavaScript data flow rules for arrays and promises have been fine-tuned one the assumption that we don't have these implicit reads.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 27, 2025

Thanks for the feedback. I don't have time right now, but at some point I plan to replace this PR with changes that:

  • implement a solution specific to Rust references.
  • clarify the doc comment here about what is intentionally included and excluded.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Feb 28, 2025

Doc change here: #18895

@geoffw0
Copy link
Contributor Author

geoffw0 commented Mar 3, 2025

Closing this, we have a vague plan for doing this a different way in Rust now (and an even vaguer plan for getting some of the incidental improvements we see in Swift by another route).

@geoffw0 geoffw0 closed this Mar 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DataFlow Library javascript Pull requests that update Javascript code JS Rust Pull requests that update Rust code Swift
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants