Conversation

chenmoneygithub
Collaborator

Revert #8098. This extra callback is useful for interpreting the actual LM prompt and response; the callback on BaseLM.__call__ only captures the processed output.

@chenmoneygithub chenmoneygithub requested a review from okhat May 5, 2025 00:15
@chenmoneygithub chenmoneygithub force-pushed the add-lm-callback-back branch from 19eb51e to 7d2f637 on May 5, 2025 00:17
@TomeHirata
Collaborator

TomeHirata commented May 6, 2025

Shall we remove the callback from BaseLM.__call__ then? It is counterintuitive to invoke the LM callback twice for each LM call, so we should change the flow a bit instead of just restoring the callback on the LM class. Also, it was not consistent with the DummyLM implementation.

@chenmoneygithub
Collaborator Author

@TomeHirata Yes, it looks weird in the UI. In fact I went through the same thinking process and realized that both callbacks are useful.

BaseLM.__call__ has additional formatting logic on top of LM.forward(), so the traced outputs are different. I talked to Yuki about whether we should make callbacks aware of the function name in addition to the class name, but we realized that users may not understand the difference between LM.__call__ and LM.forward() either.
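
For reference, a minimal handler that subscribes to the LM callbacks might look roughly like this (the import path and exact method signatures may vary across DSPy versions):

```python
import dspy
from dspy.utils.callback import BaseCallback


class LMTraceLogger(BaseCallback):
    """Rough sketch of a handler that records both layers of an LM call."""

    def on_lm_start(self, call_id, instance, inputs):
        # With the reverted callback in place, this fires for BaseLM.__call__
        # and again for LM.forward(), so the handler sees both the processed
        # inputs and the raw prompt sent to the provider.
        print(f"[{call_id}] {type(instance).__name__} start: {inputs}")

    def on_lm_end(self, call_id, outputs, exception=None):
        # `outputs` differs between the two layers because __call__ applies
        # extra formatting on top of forward()'s raw response.
        print(f"[{call_id}] end: {outputs}")


# Register the handler globally so every LM call is traced.
dspy.configure(callbacks=[LMTraceLogger()])
```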

@TomeHirata
Collaborator

TomeHirata commented May 7, 2025

Got it, thank you for the context. I think the main problem here is that the separation between __call__ and forward is not clear to users, and it is counterintuitive for the on_lm_start callback to fire twice. Is it possible to 1) add a method with (or rename forward to) a more explicit name like make_lm_call, and 2) allow callback handlers to define another callback method like on_make_lm_call_start? We can mitigate the problem with the latter alone, but the former is a plus.
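
Something roughly like this, purely illustrative (neither the renamed method nor the new hook exists today):

```python
# Illustrative sketch only -- `make_lm_call` and `on_make_lm_call_start` are
# proposed names from this thread, not existing DSPy APIs.
class BaseCallback:
    def on_lm_start(self, call_id, instance, inputs):
        """Fires once per user-facing LM call (BaseLM.__call__), after formatting."""

    def on_make_lm_call_start(self, call_id, instance, inputs):
        """Proposed hook: fires for the raw provider request made inside
        forward()/make_lm_call, so handlers can tell the two layers apart
        instead of receiving on_lm_start twice."""
```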

@chenmoneygithub
Collaborator Author

@TomeHirata Yeah, that could help, but honestly I would not bother: when I talked to users, it turned out they can understand the double trace on the LM call either on their own or with a simple guide. I will leave this to your call; if you feel the double trace is too ugly, feel free to add a new hook and change the MLflow side accordingly.

@TomeHirata TomeHirata self-requested a review May 7, 2025 13:47
@chenmoneygithub
Collaborator Author

@TomeHirata @okhat Btw, we do need to merge this for now. Deeplearning.ai is specifically asking us to add back the trace of the actual LM input/output.

@okhat okhat merged commit ffb415f into stanfordnlp:main May 8, 2025
3 checks passed
TomeHirata added a commit that referenced this pull request May 9, 2025
TomeHirata added a commit that referenced this pull request May 9, 2025