Collaborator

@al-rigazzi commented Jun 25, 2024

This PR adds a simple TorchWorker which performs inference. The worker has only been tested for direct inference (and the file test_torch_worker.py reflects that). The output transform is not implemented yet, but it is not needed for the time being.

codecov bot commented Jun 25, 2024

Codecov Report

Attention: Patch coverage is 32.65306% with 66 lines in your changes missing coverage. Please review.

Please upload report for BASE (mli-feature@52abd32). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...im/_core/mli/infrastructure/worker/torch_worker.py | 35.41% | 31 Missing ⚠️ |
| smartsim/_core/launcher/dragon/dragonBackend.py | 7.69% | 12 Missing ⚠️ |
| ...e/mli/infrastructure/storage/dragonfeaturestore.py | 0.00% | 9 Missing ⚠️ |
| ...tsim/_core/mli/infrastructure/environmentloader.py | 0.00% | 6 Missing ⚠️ |
| smartsim/_core/mli/infrastructure/worker/worker.py | 68.42% | 6 Missing ⚠️ |
| smartsim/_core/mli/message_handler.py | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@              Coverage Diff               @@
##             mli-feature     #622   +/-   ##
==============================================
  Coverage               ?   63.61%           
==============================================
  Files                  ?       97           
  Lines                  ?     6690           
  Branches               ?        0           
==============================================
  Hits                   ?     4256           
  Misses                 ?     2434           
  Partials               ?        0           

| Files with missing lines | Coverage Δ |
|---|---|
| smartsim/_core/mli/comm/channel/channel.py | 66.66% <ø> (ø) |
| ...m/_core/mli/infrastructure/storage/featurestore.py | 100.00% <100.00%> (ø) |
| smartsim/_core/mli/message_handler.py | 75.82% <0.00%> (ø) |
| ...tsim/_core/mli/infrastructure/environmentloader.py | 0.00% <0.00%> (ø) |
| smartsim/_core/mli/infrastructure/worker/worker.py | 80.00% <68.42%> (ø) |
| ...e/mli/infrastructure/storage/dragonfeaturestore.py | 0.00% <0.00%> (ø) |
| smartsim/_core/launcher/dragon/dragonBackend.py | 2.32% <7.69%> (ø) |
| ...im/_core/mli/infrastructure/worker/torch_worker.py | 35.41% <35.41%> (ø) |

Contributor

@AlyssaCote left a comment

This looks so, so good! Just a couple tiny comments. No hold ups from me, though!

    import dragon
    from dragon import fli
except ImportError as exc:
    if not "pytest" in sys.modules:
Contributor

Are you sure you want to look for pytest here? This might be a copy/paste mistake.

Collaborator Author

It was an attempt to avoid failing tests when dragon was not available. This will be fixed once #621 goes in.
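For illustration, here is a minimal sketch of one way to detect an optional package without looking for pytest in sys.modules. The helper `module_available` and the flag `DRAGON_AVAILABLE` are hypothetical names; they are not taken from this PR or from #621, and the actual fix may look different.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Report whether `name` could be imported, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package or a module with a broken __spec__
        # is treated as "not available".
        return False


# Hypothetical flag: code and tests that need dragon can check it first.
DRAGON_AVAILABLE = module_available("dragon")
```

A test module could then use `pytest.mark.skipif(not DRAGON_AVAILABLE, reason="dragon not installed")`, or call `pytest.importorskip("dragon")` directly, so that tests are skipped rather than failing at import time.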

Comment on lines +107 to +108
response = MessageHandler.deserialize_response(resp)
self.measure_time("deserialize_response")
Contributor

Quick question. Shouldn't we be deserializing the request here? We serialize the request but I don't see where it's deserialized.

Contributor

Unless that's actually measured during app_receive, and if so never mind!

Collaborator Author

I think on the app side we only want to deserialize the response - is there something I'm missing?
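To make the division of work in this thread concrete, here is a rough sketch of the round trip, assuming the app serializes the request and deserializes the response while the worker does the opposite. It uses `json` as a stand-in; every function below is hypothetical and is not the `MessageHandler` API.

```python
import json


def serialize_request(payload: dict) -> bytes:
    """App side: turn a request into bytes (json stands in for the real format)."""
    return json.dumps(payload).encode("utf-8")


def deserialize_request(raw: bytes) -> dict:
    """Worker side: recover the request sent by the app."""
    return json.loads(raw.decode("utf-8"))


def serialize_response(payload: dict) -> bytes:
    """Worker side: turn the result into bytes."""
    return json.dumps(payload).encode("utf-8")


def deserialize_response(raw: bytes) -> dict:
    """App side: recover the result sent by the worker."""
    return json.loads(raw.decode("utf-8"))


def worker_handle(raw_request: bytes) -> bytes:
    """Hypothetical worker: doubling each value stands in for inference."""
    request = deserialize_request(raw_request)
    result = [2 * x for x in request["input"]]
    return serialize_response({"result": result})


def app_round_trip(values: list) -> list:
    """Hypothetical app: only serializes requests and deserializes responses."""
    raw_request = serialize_request({"input": values})
    raw_response = worker_handle(raw_request)
    return deserialize_response(raw_response)["result"]
```

In this sketch the app never calls `deserialize_request`, which matches the author's reading that only the response needs to be deserialized on the app side.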

timings.append(time.perf_counter() - interm)
interm = time.perf_counter()

print(" ".join(str(time) for time in timings))
Contributor

Consider adding a custom log level, e.g. "perf", so you can configure the output through the familiar logging interface and avoid prints.

Collaborator Author

That's an interesting suggestion, thanks
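As a sketch of the suggestion above, a custom level can be registered with the standard `logging` module and used in place of `print`. The level value 15, the logger name `mli.perf`, and the `StepTimer` class are assumptions made for illustration; none of them come from the PR.

```python
import logging
import time

# Hypothetical level, placed between DEBUG (10) and INFO (20).
PERF = 15
logging.addLevelName(PERF, "PERF")

# Hypothetical logger name.
logger = logging.getLogger("mli.perf")


class StepTimer:
    """Record the time elapsed between consecutive measure() calls."""

    def __init__(self) -> None:
        self.timings = []
        self._last = time.perf_counter()

    def measure(self, label: str) -> float:
        """Store and return the time since the previous measurement."""
        now = time.perf_counter()
        elapsed = now - self._last
        self.timings.append((label, elapsed))
        self._last = now
        return elapsed

    def report(self) -> str:
        """Log all recorded timings at the PERF level and return the message."""
        message = " ".join(f"{label}={elapsed:.6f}" for label, elapsed in self.timings)
        logger.log(PERF, message)
        return message
```

With `logging.basicConfig(level=PERF)` the timings are shown, while raising the level to `logging.INFO` hides them without touching the benchmark code.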

@al-rigazzi requested a review from ankona July 11, 2024 20:26
@al-rigazzi merged commit eace71e into CrayLabs:mli-feature Jul 15, 2024
@al-rigazzi deleted the fli-worker branch July 15, 2024 20:34