Skip to content

Unwanted UNICODE string prefix breaks LIKE operator with a CHAR field in 8-bit codepage #285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ondrejtucny opened this issue Nov 9, 2022 · 2 comments

Comments

@ondrejtucny
Copy link

I have a pair of legacy tables d_limit and c_limit_type with a N:1 relationship. The legacy SQL code uses a [code] LIKE '_s' condition, which I cannot reasonably avoid right now.

The following legacy select works as intended, finding a d_limit row which is linked to a c_limit_type having the code column of the provided pattern:

select * from d_limit a
	join c_limit_type b on a.id_limit_type = b.id_limit_type
	where a.id_card = 4569828 and a.id_arch = 0 and b.[code] like '_s'

However, when I use the following query:

// find the sale limit
var limit = s.Query.All<DLimit>().FirstOrDefault(lim => lim.CardId == cardId && lim.IdArch == 0 && lim.LimitType.Code.Like("_s"));

the query returns nothing. The problem is the generated SQL:

exec sp_executesql N'SELECT [a].[id_limit], [a].[id_arch], 124 AS [TypeId], [a].[id_card], [a].[id_limit_type], [a].[limit], [a].[id_user], [a].[datum] 
	FROM [dbo].[d_limit] [a] 
	INNER JOIN [dbo].[c_limit_type] [b] ON ([a].[id_limit_type] = [b].[id_limit_type]) 
	WHERE ( ([a].[id_card] = @p0_0) AND  ( CAST([a].[id_arch]  AS integer) =  0) AND  ([b].[code] LIKE  N''_s'')) 
	ORDER BY  [a].[id_limit] ASC,  [a].[id_arch] ASC OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY;
',N'@p0_0 int',@p0_0=4569828

Here, the LIKE operator has the operand N'_s', and the UNICODE N specifier silently breaks the query. Apparently, it's not compatible with the data type of [code] which is CHAR(8). The database uses the SQL_Czech_CP1250_CI_AS collation, and hence an 8-bit code page.

I think it is an error to generate N'...' strings when querying a database with an 8-bit code page, and, specifically, when the other argument of an operator is a CHAR (or alike) and not an NVAR (or like).

In addition, please note DO here generates an unwanted cast as well: CAST([a].[id_arch] AS integer) = 0 — this cast is completely useless and I believe it is triggered by the fact that id_arch is a smallint here, and the constant is an int in C#. I think this is essentially the same issue as #189, which I reported earlier.

I am using release 7.1.0-Beta-2.

@alex-kulakov
Copy link
Contributor

alex-kulakov commented Mar 23, 2023

Hello @ondrejtucny,

Since dotnet strings are natively in unicode we chose n-prefixed types to have no problems with storing data of any language out of the box. When DO creates tables structure it uses nchar, not char. I assume you manage tables structure by yourself and build Domain in one of Legacy modes.

By the way the problem is not in N'_s', because MS SQL Server is able to implicitly cast between nchar and char. For example, in the query below all the where statements work just fine for first name "Kyle"

SELECT TOP (1000) [Id]
      ,[FirstNameNVarchar]
     ,[FirstNameVarchar]
     ,[FirstNameChar8]
  FROM [DO-Tests].[dbo].[Customer]
  --where [FirstNameNVarchar] Like N'K_le'
  --where [FirstNameNVarchar] Like 'K_le'
  --where [FirstNameVarchar] Like N'K_le'
  --where [FirstNameVarchar] Like 'K_le'
  --where [FirstNameChar8] Like N'K_le%'
  --where [FirstNameChar8] Like 'K_le'

The caviar here is that char has fixed number of characters so if you store "Kyle" to char(8) it is "Kyle____" where '_' is whitespace . In your case N'_s%' would return some values because % will substitute all the additional whitespaces which exist in actual value.

So, the options are

  1. use type with variable length and LIKE will work with N'' and '' variants
  2. add % to compensate whitespaces
  3. you can use DOs extendibility and have your own Like method. It could be similar to
  public static class CustomStringExtensions
  {
    public static bool AnsiLike(this string value, string sqlLikePattern)
    {
      var regexPattern = Regex.Replace(sqlLikePattern,
        @"[%_]|[^%_]+",
        match => {
          if (match.Value == "%") {
            return ".*";
          }
          if (match.Value == "_") {
            return ".";
          }
          return Regex.Escape(match.Value);
        });
      return Regex.IsMatch(value, $"^{regexPattern}$");
    }
  }

  [CompilerContainer(typeof(SqlExpression))]
  internal static class CustomStringCompilers
  {
    [Compiler(typeof(CustomStringExtensions), nameof(CustomStringExtensions.AnsiLike), TargetKind.Static | TargetKind.Method)]
    public static SqlExpression StringAnsiLike(SqlExpression _this,
      [Type(typeof(string))] SqlExpression pattern)
    {
      var value = (pattern as SqlLiteral<string>).Value;
      var newPattern = SqlDml.Native('\'' + value + '\'');
      return SqlDml.Like(_this, newPattern);
    }
  }

just don't forget to register CustomStringCompilers to Domain types as well as Entities

@ondrejtucny
Copy link
Author

Thanks for the workaround!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants