@juj juj commented Oct 24, 2019

Fix GLES2 emulated integer functions op_and(), op_or() and op_xor() to work properly for negative input values, and optimize to avoid control flow.

E.g. before:

int BITWISE_BIT_COUNT = 32;

int op_modi(int x, int y)
{
   return x - y * (x / y);
}

int op_and(int a, int b)
{
   int result = 0;
   int n = 1;
   for (int i = 0; i < BITWISE_BIT_COUNT; i++)
   {
      if ((op_modi(a, 2) != 0) && (op_modi(b, 2) != 0))
      {
         result += n;
      }
      a = a / 2; // Bug: Fails to shift right if a < 0
      b = b / 2; // Bug: Fails to shift right if b < 0
      n = n * 2;
      if (!(a > 0 && b > 0)) // Bug: Fails if a < 0 or b < 0 (though simple fix to change to test a != 0 && b != 0)
      {
         break;
      }
   }
   return result;
}
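
To see the failure concretely, the "before" helper can be modelled on the host. This is a minimal C sketch rather than shader code. It assumes integer division truncates toward zero (as in C99), which GLSL ES 1.00 does not strictly guarantee, and the `old_` names are made up for this illustration. The doubling of `n` is moved after the early exit so that the C model has no signed overflow; the returned values are unchanged.

```c
#include <stdint.h>

/* Host-side model of the pre-fix helper (hypothetical names).
   Assumption: '/' truncates toward zero, as in C99. */
int32_t old_op_modi(int32_t x, int32_t y)
{
    return x - y * (x / y);
}

int32_t old_op_and(int32_t a, int32_t b)
{
    int32_t result = 0;
    int32_t n = 1;
    for (int i = 0; i < 32; i++)
    {
        if (old_op_modi(a, 2) != 0 && old_op_modi(b, 2) != 0)
            result += n;
        a /= 2; /* rounds toward zero, so negative values do not shift right */
        b /= 2;
        if (!(a > 0 && b > 0))
            break; /* exits immediately once either input is zero or negative */
        n *= 2;    /* moved after the break to avoid signed overflow in C */
    }
    return result;
}
```

Under that assumption, `old_op_and(-1, -1)` yields 1 rather than -1 and `old_op_and(-4, 7)` yields 0 rather than 4, while non-negative inputs such as `(6, 3)` still give the correct 2.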

After:

int op_and(int a, int b)
{
   // First extract the sign bit to convert inputs to positive values.
   int result = (a < 0 && b < 0) ? -2147483648 : 0;
   if (a < 0) a -= -2147483648;
   if (b < 0) b -= -2147483648;
   int n = 1;
   ivec2 ab = ivec2(a, b); // Use vectorization
   for (int i = 0; i < 31; i++) // Loop excluding the sign bit
   {
      ivec2 ab_div = ab / 2;
      ivec2 ab_rem = ab - ab_div*2; // Avoid calling op_modi() to optimize away integer divs.
      // Here ab_rem.x and ab_rem.y are either 0 or 1.
      result += n * ab_rem.x * ab_rem.y; // For one-bit values a and b, a & b == a*b
      ab = ab_div;
      n += n;
      // At the end avoid test "if (a == 0 || b == 0) break;", as counterproductive
   }
   return result;
}
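
One way to sanity-check the "after" logic is to port it to C and compare it with the native `&` operator. The sketch below is a scalar port (no `ivec2`), with the hypothetical name `new_op_and`. It widens to 64 bits when clearing the sign bit and keeps `n` unsigned, purely to stay clear of signed-overflow rules in C; neither detail is part of the shader code above.

```c
#include <stdint.h>

/* Scalar C port of the branch-free op_and (hypothetical name). */
int32_t new_op_and(int32_t a, int32_t b)
{
    /* The sign bit is set in the result only when both inputs are negative. */
    int32_t result = (a < 0 && b < 0) ? INT32_MIN : 0;
    /* Clear the sign bit so the remaining 31 bits form a non-negative value. */
    if (a < 0) a = (int32_t)((int64_t)a - INT32_MIN);
    if (b < 0) b = (int32_t)((int64_t)b - INT32_MIN);
    uint32_t n = 1u; /* unsigned so the final doubling to 2^31 is well defined */
    for (int i = 0; i < 31; i++)
    {
        int32_t a_div = a / 2, b_div = b / 2;
        int32_t a_rem = a - a_div * 2; /* 0 or 1 */
        int32_t b_rem = b - b_div * 2; /* 0 or 1 */
        result += (int32_t)n * a_rem * b_rem; /* one-bit AND is a product */
        a = a_div;
        b = b_div;
        n += n;
    }
    return result;
}
```

For example, `new_op_and(-4, 7)` returns 4 and `new_op_and(-1, -1)` returns -1, matching `-4 & 7` and `-1 & -1` on a two's complement host.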

Similar transformations are applied to op_or and op_xor.

In case of op_or, for two one-bit inputs a and b, a | b can be computed as (a+b+1)/2 using integer division; plain (a+b)/2 would truncate to 0 when only one of the bits is set.
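
As an illustration of the op_or case, here is a hedged C sketch following the same structure as the op_and code above. The name `emu_op_or` is hypothetical, and the per-bit expression uses the rounding-up form `(a_rem + b_rem + 1) / 2`, because with truncating integer division the plain `(a_rem + b_rem) / 2` evaluates to 0 when exactly one bit is set. Whether the actual patch uses this exact expression is not shown here.

```c
#include <stdint.h>

/* Scalar C sketch of a branch-free bitwise OR (hypothetical name). */
int32_t emu_op_or(int32_t a, int32_t b)
{
    /* The sign bit is set in the result when either input is negative. */
    int32_t result = (a < 0 || b < 0) ? INT32_MIN : 0;
    if (a < 0) a = (int32_t)((int64_t)a - INT32_MIN);
    if (b < 0) b = (int32_t)((int64_t)b - INT32_MIN);
    uint32_t n = 1u;
    for (int i = 0; i < 31; i++)
    {
        int32_t a_div = a / 2, b_div = b / 2;
        int32_t a_rem = a - a_div * 2; /* 0 or 1 */
        int32_t b_rem = b - b_div * 2; /* 0 or 1 */
        /* (0+0+1)/2 == 0, (1+0+1)/2 == 1, (1+1+1)/2 == 1 */
        result += (int32_t)n * ((a_rem + b_rem + 1) / 2);
        a = a_div;
        b = b_div;
        n += n;
    }
    return result;
}
```

With this form, `emu_op_or(6, 3)` returns 7 and `emu_op_or(-8, 5)` agrees with the native `-8 | 5`.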

In case of op_xor, a ^ b is implemented using int(ab_rem.x != ab_rem.y).
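
The op_xor case can be sketched in C the same way. The name `emu_op_xor` is hypothetical, and the sign-bit rule (set when exactly one input is negative) is my reading of how the op_and pattern carries over, not something quoted from the patch.

```c
#include <stdint.h>

/* Scalar C sketch of a branch-free bitwise XOR (hypothetical name). */
int32_t emu_op_xor(int32_t a, int32_t b)
{
    /* The sign bit is set in the result when exactly one input is negative. */
    int32_t result = ((a < 0) != (b < 0)) ? INT32_MIN : 0;
    if (a < 0) a = (int32_t)((int64_t)a - INT32_MIN);
    if (b < 0) b = (int32_t)((int64_t)b - INT32_MIN);
    uint32_t n = 1u;
    for (int i = 0; i < 31; i++)
    {
        int32_t a_div = a / 2, b_div = b / 2;
        int32_t a_rem = a - a_div * 2; /* 0 or 1 */
        int32_t b_rem = b - b_div * 2; /* 0 or 1 */
        /* One-bit XOR: 1 when the bits differ, 0 otherwise. */
        result += (int32_t)n * (a_rem != b_rem);
        a = a_div;
        b = b_div;
        n += n;
    }
    return result;
}
```

Here `emu_op_xor(6, 3)` returns 5, and `emu_op_xor(-1, 0)` returns -1, in line with the native `^` operator.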

@juj juj force-pushed the fix_gles2_integer_ops branch from fceb3d1 to 3ef11ff September 22, 2020 11:48
Lssikkes commented Jul 3, 2021

Thanks for this juj, I've pulled it into my local branch
