Skip to content

Conversation

@jverzani
Copy link
Contributor

This speeds up N by avoiding use of BasicType to control dispatch in favor of multiple ifelse checks.

@jverzani
Copy link
Contributor Author

This also reorganizes code to put like things together.

The upshot here is

Before


julia> for x in (1,1.1, big(1), big(1.1), π)
           a = Basic(x); @show x; @time float(a)
       end
x = 1
  0.000014 seconds (5 allocations: 96 bytes)
x = 1.1
  0.000028 seconds (3 allocations: 64 bytes)
x = 1
  0.000022 seconds (5 allocations: 96 bytes)
x = 1.100000000000000088817841970012523233890533447265625
  0.000044 seconds (4 allocations: 152 bytes)
x = π
  0.000013 seconds (3 allocations: 72 bytes)

and

julia> @btime convert(Int, u) setup=(u=Basic(rand(1:10)))
  540.225 ns (4 allocations: 80 bytes)

And after

julia> for x in (1,1.1, big(1), big(1.1), π)
           a = Basic(x); @show x; @time float(a)
       end
x = 1
  0.000005 seconds (3 allocations: 56 bytes)
x = 1.1
  0.000002 seconds (1 allocation: 16 bytes)
x = 1
  0.000002 seconds (3 allocations: 56 bytes)
x = 1.100000000000000088817841970012523233890533447265625
  0.000002 seconds (2 allocations: 104 bytes)
x = π
  0.000004 seconds

and

julia> @btime convert(Int, u) setup=(u=Basic(rand(1:10)))
  54.322 ns (2 allocations: 40 bytes)

The last conversion is 10 times faster, the others faster as they require fewer allocations. The main performance improvement being dispatch based on BasicType has overhead, as get_symengine_class is slower due to it needing to allocate a string; and dispatch through Val is slower than ifelse clauses.

@isuruf
Copy link
Member

isuruf commented Mar 19, 2025

Btw, following should reduce the memory allocations, but doesn't get anywhere near this PR.

diff --git a/src/types.jl b/src/types.jl
index 435d995..b9c3512 100644
--- a/src/types.jl
+++ b/src/types.jl
@@ -106,9 +106,24 @@ function get_class_from_id(id::UInt)
     str
 end
 
-"Get SymEngine class of an object (e.g. 1=>:Integer, 1//2 =:Rational, sin(x) => :Sin, ..."
-get_symengine_class(s::Basic) = Symbol(get_class_from_id(get_type(s)))
+function get_symengine_classes()
+  d = Vector{Symbol}()
+  i = 0
+  while true
+    s = get_class_from_id(UInt(i))
+    if s == ""
+      break
+    end
+    push!(d, Symbol(s))
+    i+=1
+  end
+  return d
+end
 
+symengine_classes = get_symengine_classes()
+
+"Get SymEngine class of an object (e.g. 1=>:Integer, 1//2 =:Rational, sin(x) => :Sin, ..."
+get_symengine_class(s::Basic) = symengine_classes[get_type(s) + 1]
 
 ## Construct symbolic objects
 ## renamed, as `Symbol` conflicts with Base.Symbol

@jverzani
Copy link
Contributor Author

Ohh, interesting. I was thinking of using a dictionary to cache calls based on the value of get_type, but didn't realize this could be done so directly. I'll make a PR so we can measure the improvement.

@jverzani jverzani mentioned this pull request Mar 19, 2025
@jverzani
Copy link
Contributor Author

Hi @isuruf, can I merge this?

@isuruf
Copy link
Member

isuruf commented Mar 20, 2025

@jverzani can you check that the last commit doesn't make things slowdown?

@jverzani
Copy link
Contributor Author

Hi @isuruf, I like the dispatch style as the code is much clearer, but for these two basic types it seems a bit slower:

After

julia> @btime N(a) setup=(a = Basic(rand()));
  117.817 ns (1 allocation: 16 bytes)

julia> @btime N(a) setup=(a = Basic(rand(1:10)));
  164.129 ns (2 allocations: 40 bytes)

Before

julia> @btime N(a) setup=(a = Basic(rand()));
  16.680 ns (0 allocations: 0 bytes)

julia> @btime N(a) setup=(a = Basic(rand(1:10))); # allocates a BigInt
  53.545 ns (2 allocations: 40 bytes)

I modified the calling order of N to be (what I think is) a bit more idiomatic, but that's the only difference.

@jverzani
Copy link
Contributor Author

jverzani commented Mar 21, 2025

[I updated this, I used the wrong branch in testing which didn't have the improvements to get_symengine_class]

Hi @isuruf. Here is a quick little benchmark of different styles to compare the performance of runtime dispatch versus a simple ifelse switch:

using SymEngine, BenchmarkTools

using SymEngine: libsymengine, BasicType, is_a_Integer,
    is_a_RealDouble,
    evalf, get_symengine_class


function _convert(::Type{BigInt}, b::Basic)
    a = BigInt()
    _convert_bigint!(a, b)
    return a
end

function _convert_bigint!(a::BigInt, b::Basic) # non-allocating (sometimes)
    is_a_Integer(b) || throw(ArgumentError("Not an integer"))
    ccall((:integer_get_mpz, libsymengine), Nothing, (Ref{BigInt}, Ref{Basic}), a, b)
    a
end

function _convert(::Type{Cdouble}, b::Basic)
    is_a_RealDouble(b) || throw(ArgumentError("Not a real double"))
    return ccall((:real_double_get_d, libsymengine), Cdouble, (Ref{Basic},), b)
end

function _N_Integer(b::Basic)
    a = _convert(BigInt, b)
    if (a.size > 1 || a.size < -1)
        return a
    elseif (a.size == 0)
        return 0
    else
        u = unsafe_load(a.d, 1)
        w = signed(u)
        if w > 0
            return w * a.size
        elseif a.size == 1
            return u
        else
            return a
        end
    end
end


function _N_RealDouble(b::Basic)
    return ccall((:real_double_get_d, libsymengine), Cdouble, (Ref{Basic},), b)
end

_N_RealDouble(b::Basic) = _convert(Cdouble, b)

# non mutating
struct _BasicType{T} <: Number
    x::Basic
end
_BasicType(x::Basic) = _BasicType{Val{get_symengine_class(x)}}(x)
Basic(x::_BasicType) = x.x


## no dispatch
function Na(x::Basic)
    if is_a_Integer(x)
        return _N_Integer(x)
    elseif is_a_RealDouble(x)
        return _N_RealDouble(x)
    else
        return _N_RealDouble(evalf(x, 53, true))
    end
end


## BasicType
Nb(x::Basic) = Nb(BasicType(x))
Nb(x::BasicType{Val{:Integer}}) = _N_Integer(Basic(x))
Nb(x::BasicType{Val{:RealDouble}}) = _N_RealDouble(Basic(x))
Nb(x::BasicType{Val{T}}) where {T} = Nb(evalf(Basic(x), 53, true))


## Val
Nc(x::Basic) = Nc(Val(get_symengine_class(x)), x)
Nc(::Val{:Integer}, x) = _N_Integer(x)
Nc(::Val{:RealDouble}, x) = _N_RealDouble(x)
Nc(::Val{<:Any}, x) = Nc(evalf(x, 53, true))

## BasicType
Nd(x::Basic) = Nd(_BasicType(x))
Nd(x::_BasicType{Val{:Integer}}) = _N_Integer(Basic(x))
Nd(x::_BasicType{Val{:RealDouble}}) = _N_RealDouble(Basic(x))
Nd(x::_BasicType{Val{T}}) where {T} = Nd(evalf(Basic(x), 53, true))

for meth in (Na,Nb,Nc,Nd)
    @show nameof(meth)
    @btime $meth(a) setup=(a=Basic(rand(1:10)))
    @btime $meth(a) setup=(a=Basic(rand()))
    @btime $meth(a) setup=(a=Basic(big(rand())))
end

This gives

nameof(meth) = :Na
  51.731 ns (2 allocations: 40 bytes)
  13.763 ns (0 allocations: 0 bytes)
  907.711 ns (7 allocations: 152 bytes)
nameof(meth) = :Nb
  257.930 ns (3 allocations: 56 bytes)
  209.437 ns (1 allocation: 16 bytes)
  1.378 μs (10 allocations: 200 bytes)
nameof(meth) = :Nc
  147.665 ns (2 allocations: 40 bytes)
  97.182 ns (0 allocations: 0 bytes)
  1.151 μs (8 allocations: 168 bytes)
nameof(meth) = :Nd
  258.721 ns (3 allocations: 56 bytes)
  208.662 ns (1 allocation: 16 bytes)
  1.384 μs (10 allocations: 200 bytes)

I was expecting Nd to be similar to Nc, but there is an extra allocation.

I think the construction of a Val object from a Basic object, is where the performance is lost, as a check like is_a_RealDouble is basically free:

julia> x = Basic(1.23)
julia> @btime Val{get_symengine_class($x)}
  77.041 ns (0 allocations: 0 bytes)

@isuruf
Copy link
Member

isuruf commented Mar 21, 2025

Can you try the following?

using SymEngine, BenchmarkTools

using SymEngine: BasicType, is_a_Integer, _N_Integer,
    is_a_RealDouble, _N_RealDouble,
    evalf, get_symengine_class

# non mutating
struct _BasicType{T} <: Number
    x::Basic
end
_BasicType(x::Basic) = _BasicType{Val{get_symengine_class(x)}}(x)
Basic(x::_BasicType) = x.x

## no dispatch
function Na(x::Basic)
    if is_a_Integer(x)
        return _N_Integer(x)
    elseif is_a_RealDouble(x)
        return _N_RealDouble(x)
    else
        return _N_RealDouble(evalf(x, 53, true))
    end
end

## BasicType
Nb(x::Basic) = Nb(BasicType(x))
Nb(x::BasicType{Val{:Integer}}) = _N_Integer(Basic(x))
Nb(x::BasicType{Val{:RealDouble}}) = _N_RealDouble(Basic(x))
Nb(x::BasicType{Val{T}}) where {T} = Nb(SymEngine.evalf(Basic(x), 53, true))

const symengine_classes_val = [Val(c) for c in SymEngine.symengine_classes]

## Val
Nc(x::Basic) = Nc(symengine_classes_val[SymEngine.get_type(x) + 1], x)
Nc(::Val{:Integer}, x::Basic) = _N_Integer(x)
Nc(::Val{:RealDouble}, x::Basic) = _N_RealDouble(x)
Nc(::Val{<:Any}, x::Basic) = Nc(evalf(x, 53, true))

## non-mutable BasicType
Nd(x::Basic) = Nd(_BasicType(x))
Nd(x::_BasicType{Val{:Integer}}) = _N_Integer(Basic(x))
Nd(x::_BasicType{Val{:RealDouble}}) = _N_RealDouble(Basic(x))
Nd(x::_BasicType{Val{T}}) where {T} = Nd(evalf(Basic(x), 53, true))

const integer_type_id::UInt64 = findfirst(==(:Integer), SymEngine.symengine_classes) - 1
const real_double_type_id::UInt64 = findfirst(==(:RealDouble), SymEngine.symengine_classes) - 1

function Ne(x::Basic)
    t::UInt64 = SymEngine.get_type(x)
    if t == 0
        return _N_Integer(x)
    elseif t == 6
        return _N_RealDouble(x)
    else
        return _N_RealDouble(evalf(x, 53, true))
    end
end

for meth in (Na,Nb,Nc,Nd,Ne)
    @show nameof(meth)
    @btime $meth(a) setup=(a=Basic(rand(1:10)))
    @btime $meth(a) setup=(a=Basic(rand()))
    @btime $meth(a) setup=(a=Basic(big(rand())))
end

@jverzani
Copy link
Contributor Author

This is great! I now see:

nameof(meth) = :Na
  65.827 ns (2 allocations: 40 bytes)
  27.206 ns (1 allocation: 16 bytes)
  914.594 ns (8 allocations: 168 bytes)
nameof(meth) = :Nb
  256.362 ns (3 allocations: 56 bytes)
  207.072 ns (1 allocation: 16 bytes)
  1.366 μs (10 allocations: 200 bytes)
nameof(meth) = :Nc
  54.435 ns (2 allocations: 40 bytes)
  13.049 ns (0 allocations: 0 bytes)
  929.333 ns (8 allocations: 168 bytes)
nameof(meth) = :Nd
  257.756 ns (3 allocations: 56 bytes)
  208.503 ns (1 allocation: 16 bytes)
  1.363 μs (10 allocations: 200 bytes)
nameof(meth) = :Ne
  51.705 ns (2 allocations: 40 bytes)
  11.118 ns (0 allocations: 0 bytes)
  898.667 ns (7 allocations: 152 bytes)

The Nc approach seems like winner, fast and still quite readable! I'll add this to the current PR.

@jverzani
Copy link
Contributor Author

Thanks so much. With this most recent tweak, I'm seeing this:


julia> for meth in (N,)
           @show nameof(meth)
           @btime $meth(a) setup=(a=Basic(rand(1:10)))
           @btime $meth(a) setup=(a=Basic(rand()))
           @btime $meth(a) setup=(a=Basic(big(rand())))
       end
nameof(meth) = :N
  70.198 ns (2 allocations: 40 bytes)
  33.239 ns (1 allocation: 16 bytes)
  47.916 ns (2 allocations: 104 bytes)

I don't know why there is the 1 allocation for a Float64 (there were 0 in the test cases) but this is much more performant that the currently tagged version:

nameof(meth) = :N
  404.950 ns (4 allocations: 80 bytes)
  438.843 ns (3 allocations: 64 bytes)
  426.176 ns (4 allocations: 152 bytes)

@jverzani
Copy link
Contributor Author

Hi @isuruf, I'm done tinkering here, unless you have other ideas.

@jverzani jverzani merged commit 6519f4c into symengine:master Mar 30, 2025
9 checks passed
@jverzani jverzani deleted the N_speedup branch March 30, 2025 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants