Skip to content

Implemented a visitor class for codet #1789

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Degiorgio
Copy link
Contributor

Visitor for codet, In brief, the class allows you to associate code ID type (example: ID_function_call) with a callback. When executed (by calling visit_symbols) the visitor leverages expr_visitort to explore the given symbol table and issue a callback for all codet's that it knows about.

Example of use:

void java_bytecode_languaget::convert_threadblock(codet &code,
  symbol_tablet &symbol_table)
{
  PRECONDITION(code.get_statement()==ID_function_call);
  // do something here
}
code_visitort thread_block_visitor;
thread_block_visitor.subscribe(ID_function_call,
    &java_bytecode_languaget::convert_threadblock,
    this,
    std::placeholders::_1,
    std::ref(symbol_table));
  thread_block_visitor.visit_symbols(symbol_table);

In this example, all visitor will execute java_bytecode_languaget::convert_threadblock for every ID_function_call in the symbol_table and pass along the appropriate parameters.

Note: subscribe can be called multiple times (with different IDs)
Note': the commit, does not include an example of how this is used (this will be coming along in a different pr which will add support for java concurrency)

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable, couple of nitpicks

/// \param id: expr: expression
void operator()(exprt &expr) override
{
if(expr.id() != ID_code || !expr.is_not_nil())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Condition is redundant -- nil expressions have id() == ""

cmapt call_map;
};

/// \brief codet visitor, leverages expression visitor to visit all the codet's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Language nitpick: s/leverages/uses/

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What template/documented idea of "visitor" is this following? It neither uses Gang-of-Four naming conventions nor are (the equivalents of) visit() calls very efficient (as they require going through a std::map).

Here is what needs to happen:

  • If it's your own design, use your own terminology and clearly steer away from existing one.
  • If it isn't a completely novel design, reference your source.
  • Tests are strictly required.
  • Explain when and when not to use this functionality.

@Degiorgio
Copy link
Contributor Author

@tautschnig regarding

"Explain when and when not to use this functionality"

should this be manifested as comments in the header file?

@Degiorgio
Copy link
Contributor Author

Degiorgio commented Feb 2, 2018

@tautschnig This is not a classic visitor pattern. As This would be a bit of pain to implement properly, at least for codet, it extends the behaviour of the existing expression visitor.

Pattern wise, it is inspired from the observable-pattern, with the difference that observers are registered through a callback instead of inheriting from some IObserve (Makes things simpler for our purposes).

Agreed the terminology I used is bit confusing, I will update the code to make it clearer, I will also add tests.

@tautschnig
Copy link
Collaborator

"Explain when and when not to use this functionality"
should this be manifested as comments in the header file?

I think comments in the code are probably best. The main reason this comment of mine even came up is that you are introducing new functionality that isn't used anywhere. It's a new feature that isn't user-visible, so you need to tell developers what to do.

@Degiorgio
Copy link
Contributor Author

Degiorgio commented Feb 2, 2018

@tautschnig , latest commit should address the issues you raised, can you confirm please? (tests incoming)

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A unit test is required as bare minimum. Various other fixes as indicated in comments.

{
static_assert(std::is_base_of<codet, T>::value, "T is not derived from codet");
public:
virtual ~code_observer_baset() {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might use =default; ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, assumed you meant " = default; " (new formating rules)

public:
virtual ~code_observer_baset() {}

/// registers an observer, that is associated with specific codet id.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit pick: no comma before "that" (as a general rule) - also applies elsewhere in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// \param f: observer callback, that is associated with the given, \p id
/// \param args: arguments associated with the given callback, \p f
template<class F, class ...Args>
void registerObserver(const irep_idt &id, const F &&f, Args&&... args)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we indent here. No CaMelcaSe, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
private:
using observerst = std::function<void(T&)>;
using code_observer_mapt = std::map<const irep_idt, const observerst>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe a std::unordered_map would be preferable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed done

code_observer_mapt code_observers_map;
};

/// \brief codet observer, uses expression visitor to visit all the codet's
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment provides an implementation detail rather than focusing on the big picture. Write documentation by working backwards from the potential user, not for those editing the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

public code_observer_baset<const codet>
{
public:
virtual ~const_code_observert() {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary as implicit due to inheritance from code_observer_baset

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

{
forall_symbols(symbol, symbol_table.symbols)
{
symbolt &w_symbol=symbol_table.get_writeable_ref(symbol->second.name);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lookup while already iterating over the map! This code should be:

Forall_symbols(it, symbol_table.symbols)
{
  it->second.value.visit(*this);
}

Copy link
Contributor Author

@Degiorgio Degiorgio Feb 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Forall_symbols" does not exist anymore.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry:

for(auto &sym_entry : symbol_table.symbols)
{
  sym_entry.second.value.visit(*this);
}

(and someone better clean up symbol_table_baset, but that's not for this PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tautschnig symbol_table.symbols is constant, so I would need to cast it, which seems ugly to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to raise this with @NathanJPhillips or @smowton. I don't recall making or soliciting any of those changes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR doesn't actually introduce a very essential feature, does it? I'm finding it hard to see the urgency for a half-ready piece of work. The usual approach to visiting various expressions was:

void do_something_rec(const exprt &expr)
{
  if(expr.id() == ID_code && to_code(expr).get_statement() == ID_foo)
    found_it();
  
  forall_operands(it, expr)
    do_something_rec(*it);
}

void do_something(const symbol_tablet &symbol_table)
{
  for(const auto &sym_entry : symbol_table.symbols)
    do_something_rec(sym_entry.second.value);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR introduces a utility that help you avoid such boilerplate every time. That's arguably something desirable (but I agree that it's not absolutely essential (but for the same reason constructs such as functions or classes are not essential either)).

The urgency doesn't stem from the PR itself but from our time constraints in the team ;)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully understand that there are time constraints. But putting in a half-baked feature that thus only half-solves problems is a very questionable sort of efficiency. Yes, the boilerplate isn't great. But it's not half-baked, it is simple, and not impacted by people fiddling with APIs. And it's not the first of it's kind. All of this boilerplate code ought to be replaced by a decent visitor. All of them, not half of them. I really don't want to see another depth_iteratort with its current 2 users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do a symbol_tablet iterator quickly that had the semantics of a const_iterator and so doesn't break sharing/journalling but also included a get_writeable method that could be called when an element was encountered that you wanted to write to. Is that what you want? You could build a visitor on top of that, though it would be an unusual one as it would need to implement two callbacks, one that checked whether it wanted to write to the symbol and another that got a writable symbol.

Copy link
Contributor Author

@Degiorgio Degiorgio Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NathanJPhillips we could do that, and then in the non-const version of the code observer/visitor, when the callback is issued along with a reference to a const codet you will get a reference to the symbol_tabelt iterator which will allows you to call get_writeable ref. What do you think? This will avoid the extra look-up but you would still need to walk-down the tree

{
const symbolt &w_symbol=symbol_table.lookup_ref(symbol->second.name);
w_symbol.value.visit(*this);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, except now as the const version:

forall_symbols(it, symbol_table.symbols)
{
  it->second.value.visit(*this);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if(expr.id() != ID_code)
return;
notify(to_code(expr));
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy&paste all fixes from above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

};
};

#endif // CPROVER_UTIL_CODE_VISITOR_H
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Degiorgio Degiorgio force-pushed the codet-visitor branch 2 times, most recently from 634e68e to c581758 Compare February 5, 2018 16:16
@Degiorgio
Copy link
Contributor Author

@tautschnig , should address all of the issues you raised (including tests)

public code_observer_baset<codet>
{
public:
/// iterates through a given symbol table and for every symbol calls it's
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit pick: s/it's/its/

{
symbolt &w_symbol = symbol_table.get_writeable_ref(symbol->second.name);
w_symbol.value.visit(*this);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment a few minutes ago to git rid of the duplicate lookup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tautschnig, the "Forall_symbols" macro has been removed due to journalling-symbol-ttable

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1789 (comment):

for(auto &sym_entry : symbol_table.symbols)
{
  sym_entry.second.value.visit(*this);
}

code = codet(ID_skip);
replaced += 1;
}
int replaced;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want a constructor to initialise this to 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this one purpose, to make it clear what the value of replaced is in the test, but will change if you prefer

REQUIRE(code.get_statement() == ID_decl);

code = codet(ID_skip);
replaced += 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not ++replaced;?

{
REQUIRE(code.id() == ID_code);
REQUIRE(code.get_statement() == ID_skip);
count += 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++count;

add_codet(codet(ID_assign), "ASSIGN1", symbol_table);

decl_replacert obj;
obj.replaced = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RAII.

const codet& code = to_code(it->second.value);
REQUIRE(code.get_statement() != ID_decl);
if(code.get_statement() == ID_skip)
num_skips += 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

@Degiorgio Degiorgio force-pushed the codet-visitor branch 3 times, most recently from c5e70cb to 0a2fb4f Compare February 6, 2018 10:58
Copy link
Member

@peterschrammel peterschrammel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the symbol table should be fixed to complete the other half of this implementation (@Degiorgio, please raise an issue for doing this), but I wouldn't make this a condition for getting this PR in. @Degiorgio is saying that the upcoming PRs heavily use this facility and using recursion boilerplate for that would clutter the code. It seems much less work to complete the implementation of the observer (which is a considerable amount of work) in a second PR and effortlessly switch over to the upgraded observer implementation, than redoing everything with custom recursions and changing them back to observers again at a later point in time.

///
/// const_code_observert code_observer;
/// code_observer.register_observer(ID_function_call,
/// &A::exmaine_function_calls,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

examine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// \endcode
///
/// Note': each unique irep_idt can only be associated with one observer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a note why this doesn't support modifying symbol table entries and what the impact of using this observer for modifying symbol table entries is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@tautschnig
Copy link
Collaborator

It seems much less work to complete the implementation of the observer (which is a considerable amount of work) in a second PR and effortlessly switch over to the upgraded observer implementation, than redoing everything with custom recursions and changing them back to observers again at a later point in time.

Up to you, feel free to override my review. I'd just ask not to create another non-trivial yet hardly used infrastructure -- depth_iteratort still pains me.

@peterschrammel
Copy link
Member

I'd be the last one advocating against uniformity. I see this PR as a step towards replacing these custom iterations by proper patterns. I would be more worried if we introduced parallel infrastructure that does the same thing, and none of them does it well in the end. If there is path towards completing this implementation and eventually making it the standard way of doing things then I'm fine with it. If we think that this will never be the standard way of doing things then we shouldn't introduce this facility.

@tautschnig
Copy link
Collaborator

Suggested plan of action:

  1. @NathanJPhillips and @smowton provide input on when the symbol table API can be fixed.
  2. Assuming the answer isn't "now", create an issue to track the need to fix it and any possible adjustments to the observer code introduced in this PR.
  3. Merge this PR with a TODO linking to the above issue.

@Degiorgio
Copy link
Contributor Author

@peterschrammel done

@tautschnig tautschnig dismissed their stale review February 6, 2018 16:36

Management over to others - I've loudly expressed all concerns, and the current version is free from problems.

@NathanJPhillips
Copy link
Contributor

This would be my suggestion for a symbol_tablet iterator: NathanJPhillips@143a8bc
Note that this is untested code (I haven't even compiled it) - just a sketch of how it should look.
This would enable you to create a visitor that took a const symbolt &, returned a bool to say whether it wanted to modify it and then if it returned true had another function that took a symbolt &.

@tautschnig
Copy link
Collaborator

@NathanJPhillips Thanks a lot for the quick turnaround!

@NathanJPhillips
Copy link
Contributor

I've provided the two PR's above to create the infrastructure you need. You should now be able to iterate over the symbol table then depth-first iterate over discovered expressions all using constant references. When you finally find something you want to change you can then call get_writable on the symbol_table iterator and pass the returned non-const exprt reference to get_mutable on the constant depth iterator. Then call mutate on each iteration position you wish to change using the returned non-const depth iterator. This will record the symbols you modify correctly and break sharing on just the paths from the root expression to the expressions you mutate. It will also not require lookups on the symbol table to get the non-const symbol_table iterator nor re-iteration of the exprt hierarchy to get the non-const depth iterator. Hope this helps.

@Degiorgio
Copy link
Contributor Author

@NathanJPhillips , great, thanks! will modify the code_observer to make use of the new api's.

@tautschnig
Copy link
Collaborator

Is it worth re-adding the non-const version now?

@Degiorgio
Copy link
Contributor Author

@tautschnig yes, will add the non-const version with new API's Nathan did.

@NlightNFotis
Copy link
Contributor

Hello.

I am going forward with closing this PR as it appears stale and no longer congruent with the current state of the repository. If you would like to see this PR through in any case, feel free to reopen the PR and rebase on top of branch develop to bring in a more up-to-date state, and ping any of the repository contributors and maintainers for reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants