Skip to content

Conversation

@TingDaoK
Copy link
Contributor

@TingDaoK TingDaoK commented Apr 6, 2024

Issue #, if available:

Description of changes:

  • Integrate with libcbor to provide a cbor encode/decode interface based on aws-c-common. Major decision made:

    • Decoding will have peek_type/get_next_*, and peek type will cache the next element, while get_next_* will consume the cached one, or just decode the next one if no cache was stored.
    • Separate type for indefinite map/array for easier handling
    • Separate type for indefinite string/bytes to avoid additional copy (The chunks has to be copied if we want to concatenate them)
  • Debatable: have the cbor in aws-c-common or aws-c-sdkutils, I put it in aws-c-common mostly for:

    • common holds all the external 3rd-party lib for now
    • right now, cbor only required by SDKs, but it's still a generic concept not limited to aws.
  • TODO:

    • Conversion from double to float is tricky, numbers like "1.1" cannot be represented by double or float.
    • Tags, should we be helpful to convert the tag value of timestamp (probably big number in the future) for the user? Or, let user to read the tag value and decide what to do with it??
    • Error handling. Error code, logging, documentation, etc.
    • Fuzz test from our side as well to make sure not undefined behavior
    • Consider:
      • Add separate encoding for float
      • Add decode number to float, decode number to double
      • Add helper to handle indefinite string/bytes

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov-commenter
Copy link

codecov-commenter commented Apr 6, 2024

Codecov Report

Attention: Patch coverage is 92.10526% with 18 lines in your changes missing coverage. Please review.

Project coverage is 83.46%. Comparing base (e620a23) to head (fd7e45b).

Files Patch % Lines
source/cbor.c 91.74% 18 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           cbor-main    #1101      +/-   ##
=============================================
+ Coverage      83.12%   83.46%   +0.33%     
=============================================
  Files             56       57       +1     
  Lines           5755     5982     +227     
=============================================
+ Hits            4784     4993     +209     
- Misses           971      989      +18     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@graebm graebm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at the encoder part of the public header. Apologies for naive comments, since I haven't looked at the decoder yet, or any implementation details...

@graebm graebm changed the title Libcbor intergrate Libcbor integrate Apr 24, 2024
* @return AWS_OP_SUCCESS successfully consumed the next element and get the result, otherwise AWS_OP_ERR.
*/
AWS_COMMON_API
int aws_cbor_decoder_pop_next_array_start(struct aws_cbor_decoder *decoder, uint64_t *out_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this handle indefinite length? document it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still curious

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought we talked about it.

    AWS_CBOR_TYPE_INDEF_BYTES_START,
    AWS_CBOR_TYPE_INDEF_TEXT_START,
    AWS_CBOR_TYPE_INDEF_ARRAY_START,
    AWS_CBOR_TYPE_INDEF_MAP_START,

We have separate types for the indef types. I can add notes to the documents.

Base automatically changed from import-libcbor to cbor-main May 20, 2024 16:12
* - 31 - AWS_CBOR_TYPE_BREAK
* - result of value not supported.
*/
enum aws_cbor_element_type {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just simple enum aws_cbor_type, since all the actual values are just like "AWS_CBOR_TYPE_UINT" anyway and don't bother with the term "element"

* @return AWS_OP_SUCCESS successfully consumed the next element and get the result, otherwise AWS_OP_ERR.
*/
AWS_COMMON_API
int aws_cbor_decoder_pop_next_array_start(struct aws_cbor_decoder *decoder, uint64_t *out_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still curious

source/cbor.c Outdated
Comment on lines 731 to 733
int aws_cbor_decoder_consume_next_element(struct aws_cbor_decoder *decoder) {
enum aws_cbor_element_type out_type = 0;
if (aws_cbor_decoder_peek_type(decoder, &out_type)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be checking decoder->error_code first thing? or is that always handled first-thing by the helper functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aws_cbor_decoder_peek_type will check the error code as the first thing.

source/cbor.c Outdated
Comment on lines 629 to 632
/* TODO: How to check overflow? */
double ms_double = timestamp_secs * 1000.0;
/* Round it up instead of type cast incase of overflow */
int64_t timestamp_ms = llround(ms_double);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know off the top of my head the right way to check, but here's the docs:
https://en.cppreference.com/w/c/numeric/math/round

source/cbor.c Outdated
break;
}
default:
AWS_LOGF_ERROR(AWS_LS_COMMON_CBOR, "The cbor data is malformed to decode.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better error message about how the type isn't expected in a timestamp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as the encoding function, I am currently more leaning towards on removing this helper.

source/cbor.c Outdated
switch (decoder->cached_context.type) {
case AWS_CBOR_TYPE_TAG:
/* Read the next data item */
/* TODO: error check for the tag content?? */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we do this before commit, or will there be a follow-up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arguably... It's complicate to 100% ensure the tag value is legal.

https://www.rfc-editor.org/rfc/rfc8949.html#tags

We need to check for all those tag values and check those data as well, but we are not even reading those data. Not sure it's worth the effort to ensure we just want to ignores the data item.

eg: if someone sends a bigfloat with an array with 2 string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I have a notes here, maybe I'll just remove this TODO.

@TingDaoK TingDaoK merged commit a3feb8f into cbor-main Jun 6, 2024
@TingDaoK TingDaoK deleted the libcbor-intergrate branch June 6, 2024 22:45
This was referenced Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants