[Benchmark] geglu example and test #582

Conversation
oulgen left a comment:

If we now need transformers, you're going to need to update the CI to handle it.

@Sibylau not needing transformers would be better if you want to write baselines.
@@ -0,0 +1,285 @@
"""

I wonder, does python benchmarks/run.py --op geglu --metrics accuracy pass (i.e., show accuracy check = 1 for all backends)?
Yes, it passes. It might be good to post the accuracy-check results in each PR and document the performance.
@Sibylau I just merged #596 to allow passing the TB operator instance as the first argument to the Helion integration wrapper geglu_tritonbench. Now we should be able to access the TB baseline's model weights in the Helion tritonbench wrapper and copy the weights into the Helion MLP.
It would be great to run the tritonbench accuracy check again to confirm it passes, thanks!
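The flow this comment describes can be sketched roughly as below. Every name here (BaselineModel, baseline_model, weights, the wrapper's body) is a hypothetical stand-in for illustration, not the actual Helion or tritonbench API:

```python
# Hedged sketch of the weight-copying flow enabled by passing the TB operator
# instance as the wrapper's first argument: the baseline model's weights are
# mirrored into the Helion model before benchmarking, so both backends
# compute with identical parameters and the accuracy check is meaningful.
# All class and attribute names below are hypothetical stand-ins.
class BaselineModel:
    def __init__(self) -> None:
        self.weights = {"gate_up_proj": [1.0, 2.0], "down_proj": [3.0, 4.0]}

class TBOperator:
    def __init__(self) -> None:
        self.baseline_model = BaselineModel()  # hypothetical attribute name

class HelionMLP:
    def __init__(self) -> None:
        self.weights = {"gate_up_proj": [0.0, 0.0], "down_proj": [0.0, 0.0]}

def geglu_tritonbench(tb_op: object, helion_mlp: HelionMLP) -> HelionMLP:
    # Narrow the plain-object annotation, then copy each baseline weight.
    assert isinstance(tb_op, TBOperator)
    for name, w in tb_op.baseline_model.weights.items():
        helion_mlp.weights[name] = list(w)
    return helion_mlp

mlp = geglu_tritonbench(TBOperator(), HelionMLP())
print(mlp.weights["gate_up_proj"])  # [1.0, 2.0]
```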
test/test_examples.py (outdated):

-    torch.backends.cuda.matmul.fp32_precision = "tf32"
-    torch.backends.cudnn.conv.fp32_precision = "tf32"
+    # torch.backends.cuda.matmul.fp32_precision = "tf32"
+    # torch.backends.cudnn.conv.fp32_precision = "tf32"
We shouldn't be commenting out these fp32 precision lines - could you share the errors you were seeing?
Sorry, I should've uncommented them before pushing the code. But I ran into this error when using these fp32 precision settings:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/__main__.py", line 18, in <module>
main(module=None)
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/main.py", line 101, in __init__
self.parseArgs(argv)
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/main.py", line 150, in parseArgs
self.createTests()
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/main.py", line 161, in createTests
self.test = self.testLoader.loadTestsFromNames(self.testNames,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/loader.py", line 232, in loadTestsFromNames
suites = [self.loadTestsFromName(name, module) for name in names]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/loader.py", line 232, in <listcomp>
suites = [self.loadTestsFromName(name, module) for name in names]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/unittest/loader.py", line 162, in loadTestsFromName
module = __import__(module_name)
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jieeliu/workspace/helion/test/test_examples.py", line 17, in <module>
torch.backends.cuda.matmul.fp32_precision = "tf32"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jieeliu/envs/conda/envs/helion/lib/python3.11/site-packages/torch/backends/cuda/__init__.py", line 149, in __setattr__
raise AttributeError("Unknown attribute " + name)
AttributeError: Unknown attribute fp32_precision
Hmm, I wonder, are you using the latest PyTorch nightly?
My torch version is:
__all__ = ['__version__', 'debug', 'cuda', 'git_version', 'hip', 'xpu']
__version__ = '2.8.0+cu128'
debug = False
cuda: Optional[str] = '12.8'
git_version = 'a1cb3cc05d46d198467bebbb6e8fba50a325d4e7'
hip: Optional[str] = None
xpu: Optional[str] = None
Yeah, I remember it was PyTorch 2.8.0 nightly. I can try different versions too.
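Since the fp32_precision attribute only exists on newer PyTorch builds (the traceback above shows a 2.8.0 install raising AttributeError on it), one defensive pattern is feature detection with hasattr, falling back to the older allow_tf32 flag. This is sketched against stand-in namespaces rather than torch itself, since which attribute exists depends on the installed version:

```python
import types

# Stand-ins for torch.backends.cuda.matmul on two hypothetical builds:
# an older one exposing only allow_tf32, and a newer one exposing the
# fp32_precision knob that the test file tried to set.
older = types.SimpleNamespace(allow_tf32=False)
newer = types.SimpleNamespace(fp32_precision="ieee")

def enable_tf32(matmul_backend: object) -> None:
    """Prefer the newer fp32_precision knob; fall back to allow_tf32."""
    if hasattr(matmul_backend, "fp32_precision"):
        matmul_backend.fp32_precision = "tf32"
    else:
        matmul_backend.allow_tf32 = True

enable_tf32(older)
enable_tf32(newer)
print(older.allow_tf32, newer.fp32_precision)  # True tf32
```

A pinned-version check in CI would make the intent explicit, but the hasattr guard keeps the test importable across versions.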
examples/geglu.py (outdated):

    down_proj: nn.Linear

    class TritonBenchOperator(Protocol):

The BaselineModel and TritonBenchOperator classes are type annotations to pass pyright type checking. If there are better ways to write them, please suggest.
To keep it simple, maybe we can just use object as the type for tb_op (similar to helion/examples/jagged_mean.py, line 141 in 5190605):

    tb_op: object, x: torch.Tensor, B: int, M: int, seqlen: int, sparsity: float
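The suggestion of typing tb_op as plain object instead of defining Protocol classes might look like this; the function body is a toy placeholder, not the real geglu kernel:

```python
from typing import Callable

# Typing the tritonbench operator as plain `object` satisfies pyright
# without a dedicated Protocol class; the wrapper just threads tb_op
# through, narrowing it with isinstance only where an attribute is
# actually needed. The computation below is a placeholder.
def geglu_tritonbench(tb_op: object, x: list[float]) -> Callable[[], list[float]]:
    def run() -> list[float]:
        return [v * v for v in x]  # placeholder, stands in for the kernel call
    return run

out = geglu_tritonbench(object(), [1.0, 2.0, 3.0])()
print(out)  # [1.0, 4.0, 9.0]
```

The trade-off is that object gives no attribute checking at the annotation site, which is acceptable when the wrapper rarely touches tb_op.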
yf225 left a comment:

Thanks @Sibylau!
[Benchmark] add geglu example and test
For the Triton kernel benchmarking issue #234.