
Support for non-numeric column type parameters #171

@blazewicz

Description

The parser fails on parametrized column type definitions that use non-numeric parameters. A type parameter is always assumed to be a size associated with the type, and the parser expects it to be an int.

PostGIS, a popular PostgreSQL extension for spatial data, adds some extra types, for example Geometry, which are parametrized with a subtype (docs).

Example table definition:

CREATE TABLE t1 (
    p Geometry(MultiPolygon, 26918)
);

The tool fails with the following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../site-packages/simple_ddl_parser/parser.py", line 345, in run
    self.tables = self.parse_data()
  File ".../site-packages/simple_ddl_parser/parser.py", line 257, in parse_data
    self.process_line(num != len(lines) - 1)
  File ".../site-packages/simple_ddl_parser/parser.py", line 287, in process_line
    self.process_statement()
  File ".../site-packages/simple_ddl_parser/parser.py", line 292, in process_statement
    self.parse_statement()
  File ".../site-packages/simple_ddl_parser/parser.py", line 300, in parse_statement
    _parse_result = yacc.parse(self.statement)
  File ".../site-packages/ply/yacc.py", line 333, in parse
    return self.parseopt_notrack(input, lexer, debug, tracking, tokenfunc)
  File ".../site-packages/ply/yacc.py", line 1120, in parseopt_notrack
    p.callable(pslice)
  File ".../site-packages/simple_ddl_parser/dialects/sql.py", line 283, in p_column
    self.set_column_size(p_list, p)
  File ".../site-packages/simple_ddl_parser/dialects/sql.py", line 291, in set_column_size
    p[0]["size"] = self.get_size(p_list)
  File ".../site-packages/simple_ddl_parser/dialects/sql.py", line 249, in get_size
    value_0 = int(p_list[-3])
ValueError: invalid literal for int() with base 10: 'MultiPolygon'

Describe the solution you'd like

The assumption about size being the only parameter should be replaced with a list of types for which this is known to be true. In all cases the parser could return an additional field with a list of the parameters associated with the type. This shouldn't be a breaking change.
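
A minimal sketch of this idea, not taken from the simple_ddl_parser code base: the names `SIZED_TYPES`, `coerce_param` and `describe_type` are hypothetical, and the set of sized types is an assumption for illustration only. Every parameter is kept in `type_params`, and `size` is filled in only for types known to take a numeric size.

```python
# Hypothetical helpers, not part of simple_ddl_parser.
# Assumed, non-exhaustive list of types whose parameters are sizes.
SIZED_TYPES = {"varchar", "char", "decimal", "numeric"}


def coerce_param(raw):
    """Return an int for numeric parameters, otherwise the stripped string."""
    raw = raw.strip()
    try:
        return int(raw)
    except ValueError:
        return raw


def describe_type(type_name, raw_params):
    """Build the type-related fields of a column from its raw parameters."""
    params = [coerce_param(raw) for raw in raw_params]
    size = None
    is_sized = type_name.lower() in SIZED_TYPES
    if is_sized and params and all(isinstance(v, int) for v in params):
        # One parameter becomes a plain int, several become a tuple.
        size = params[0] if len(params) == 1 else tuple(params)
    return {"type": type_name.lower(), "type_params": params, "size": size}


geometry = describe_type("Geometry", ["MultiPolygon", " 26918"])
varchar = describe_type("varchar", ["255"])
```

With this approach, existing consumers that only read `size` would keep working for sized types, while `Geometry(MultiPolygon, 26918)` no longer hits the `int()` conversion.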

Example output:

        "tables": [
            {
                "columns": [
                    {
                        "name": "p",
                        "type": "geometry",
                        "type_params": ["MultiPolygon", 26918],
                        "size": None,
                        "references": None,
                        "unique": False,
                        "nullable": True,
                        "default": None,
                        "check": None,
                        "on_update": None,
                    }
                ],
                ...
            }
        ],
