Standard MIME content-type #19
Comments
I'd rather prefer In addition to the Media Type, a registered structured suffix may be interesting. In my eyes even more useful, to create media types like See also: @wardi have you considered filing a registration for a json-lines Media Type and structured suffix at IANA? |
There is an IETF RFC 7464 for JSON Text Sequences that uses mime type: It allows prefixing each JSON record with <RS> control character and requires ending each JSON record with <LF>. |
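To make the framing difference concrete, here is a minimal sketch comparing the RS/LF framing of JSON Text Sequences described above with plain newline framing as used by JSON Lines. The function names are made up for this illustration and are not part of any specification or library.

```python
import json

RS = "\x1e"  # ASCII Record Separator used by JSON Text Sequences
LF = "\n"


def encode_json_seq(records):
    """Frame each record as RS + JSON text + LF (JSON Text Sequences style)."""
    return "".join(RS + json.dumps(record) + LF for record in records)


def encode_json_lines(records):
    """Frame each record as JSON text + LF (JSON Lines style)."""
    # json.dumps escapes newlines inside strings by default,
    # so each record stays on a single line.
    return "".join(json.dumps(record) + LF for record in records)


def decode_json_seq(text):
    """Split on RS and parse every non-empty chunk."""
    return [json.loads(chunk) for chunk in text.split(RS) if chunk.strip()]
```

The only difference between the two encoders is the leading RS byte, which is why the two formats are close but not interchangeable on the wire.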
This seems like a duplicate of #9. The whole purpose of the |
The lack of a definitive IANA Media Type for JSON Lines causes some difficulty for those of us using the format. In the interest of pushing the issue, I took the liberty of starting a conversation: Perhaps someone here would like to join that thread? Disclaimer: I am in no way affiliated with the IANA/IETF. I am merely interested in using the format, correctly. |
@whlavina the response from Tim Bray was the most helpful, and it looks like nothing has happened since then. I'll copy the interesting bit here for reference:
|
I am linking the relevant RFC to suggest new MIME type for standardisation: https://www.rfc-editor.org/rfc/rfc6838.html I propose working on adding the mime type Among the two ways they list to get it added to the standard tree:
I think the second one is the most relevant, which leads to https://www.rfc-editor.org/rfc/rfc5226 |
Hi @sp4ce, good to see that someone is leading the way to an actual RFC! I've noticed that AWS is (apparently) using JSON Lines for one of their products. I haven't seen a description of the actual output to know whether or not it is compatible with JSON Lines. In any case they are using the mime type |
AWS claims it's compatible with JSON Lines: it links to the JSON Lines homepage https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/S3DataExport.Output.html |
There's an [ongoing discussion](wardi/jsonlines#19) about what the MIME type for [JSONL](https://jsonlines.org/) files should be. Making it `application/jsonl` leads to the file being downloaded according to my testing, which prevents browsers from opening them in a new window and parsing them as JSON, which fixes btcpayserver#5488.
If there's still interest in doing this, I would recommend an informational track internet-draft (I-D) to describe the jsonlines specification, with an IANA considerations section registering the media type. The idea is that drafts work towards RFCs, which work towards standards, on a long evolutionary track of internet draft to RFC, and potentially to being an internet standard. IETF wants to deal with immutable and permanently available documents, so you will likely need to represent the encoding and parsing requirements authoritatively within the I-D itself, using IETF nomenclature. There are a lot of references for this available, and the JSON Text Sequences RFC is likely an excellent example. I suspect there will be feedback that some areas are not needed. For example, your UTF-8 encoding rule does not have much left to it once you reference the JSON RFC. That RFC already mandates UTF-8 for everything other than closed ecosystems. At that point, you have to decide whether the application "advice" that they might want to escape the string to work on ASCII transports becomes something you might want to represent as an application note on the jsonlines site, and a discussion you have with the IETF more broadly - after all, it would also affect JSON and json sequence data over such transports. Conversely, you may want to be quite a bit more specific for the sake of interoperability, such as whether applications MUST be able to consume |
What's wrong with what ndjson is trying to implement? Their current standard is |
The The lack of an immutable standard (like a RFC with a number) means that ndjson three years from now may make changes along lines like these for robustness, but implementations do not have a clear way to explain what they are compatible with. There are plenty of commercial products which use vendor and x-prefixed media types, and which do not attempt to define fixed/robust/interoperable behavior. It is a matter of what this project is going for, which is why my first words were "If there's still interest in doing this". In terms of ramifications, most SDOs (standard defining organizations) won't touch dependencies which do not have these and other formalisms, and may use things like publication in another SDO (like IETF) as a sign of that. That means ndjson/jsonlines may be used in public facing API, but a large category of interoperable standards work either wouldn't touch it, or will standardize their own similar effort. |
Well that's the problem, it might happen, at some point in the future. Given the usage of JSON lines in various commercial products, we're suggesting we do that formalisation now - or at least start the process very soon! |
I'd love to see this. So do we copy-paste JSON-SEQ https://datatracker.ietf.org/doc/html/rfc7464 without the "ASCII Record Separator (0x1E)"? JSON-SEQ discusses detecting truncated records and continuing a fair bit, all of that could be removed in a new RFC.
Rule 3 in https://jsonlines.org/ mentions that a compliant parser will be able to consume Lines of only whitespace are already invalid by rule 2 in https://jsonlines.org/ , but again it doesn't hurt to make this clear. To be specific let's say that any line that doesn't parse as valid JSON should be treated as an invalid record but still counts as a record for the purpose of numbering the lines. |
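The numbering rule proposed above (a line that fails to parse is an invalid record but still counts as a record) could be sketched roughly like this. The function name and the tuple shape are assumptions made for illustration, not part of the JSON Lines site or any draft.

```python
import json


def parse_json_lines(text):
    """Return (line_number, ok, value_or_error) for every line.

    Hypothetical helper: invalid lines are reported, not skipped,
    so line numbers stay aligned with the input.
    """
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a trailing newline does not start a new record

    records = []
    for number, line in enumerate(lines, start=1):
        try:
            records.append((number, True, json.loads(line)))
        except json.JSONDecodeError as exc:
            # Empty and whitespace-only lines end up here as well.
            records.append((number, False, str(exc)))
    return records
```

Splitting on `"\n"` rather than `str.splitlines()` is deliberate: `splitlines()` also breaks on characters such as U+2028, which may legally appear unescaped inside a JSON string. Python's `json` module is slightly more lenient than the JSON RFC (it accepts `NaN`, for example), so a strict implementation would need extra checks.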
Should it count as a record? The whole point of something called JSON Lines is that it stores lines of a well-defined format called JSON, not arbitrary character sequences. Depending on the nature of the malformed data in a line, it might as well make all other lines after it invalid and blow up logs with parsing-error noise when the offender is a single line (a whole file). |
I think RFCs are copyrighted so to copy paste you would need permission of the original author |
I'm glad to see continued discussion and forward movement. It's interesting to see that YAML just recently (this month) gained IANA media type registration... 22 years after the format was first created. If YAML can do it, JSON Lines can, too! If there's any need for help with the process, maybe we could ask the folks who pushed the YAML RFC? |
Here are the guidelines on how to write an Internet Draft |
As of last month, that (expired) draft is replaced by https://www.ietf.org/archive/id/draft-ietf-mediaman-standards-tree-01.html |
JSON lines is a perfect solution for JSON streaming that is a common task today for features that use LLMs. I attempted to use I didn't enjoy this solution because I wanted the
In this case, the response is interpreted as text and isn't downloaded, but the actual type of the content can also be read from the It looks kinda right and kinda wrong at the same time, but since we don't have any standard header yet, this can be used as a solid temporary solution. Let me know what you think. |
|
@GabenGar I agree, but what would be a good alternative for self-described content type of JSON lines that can be read by any client that doesn't support the proposed one? Maybe the parameter should be prefixed with
|
For what purpose do you want to violate the http spec? |
Do I? Is it directly forbidden to use custom params? |
You didn't answer the question. |
The question makes no sense. Clearly @finom doesn't want to violate any HTTP specifications - which their suggestion doesn't. See the actual specification, rather than just the developer reference: https://httpwg.org/specs/rfc9110.html#field.content-type There's no exhaustive list of valid parameters, so there's no rule against using |
You really should not use the text top-level type unless your intention is that an end user (not a developer) would be able to read the data without tooling. For this reason (browsers display the raw data to end users when there is no system-registered tooling), I'd recommend that, if this is an API endpoint, you instead leverage the Accept header. If the HTTP client does not prefer application/jsonlines over text/plain, they get a plaintext version with the text/plain media type. You can then disable the text/plain Accept handler in production.
Format is a valid media type parameter for text/plain (RFC 2646), but it dictates things like newline interpretation. Using it to say “no I really mean this other media type” is certainly not its intended use or how media types are intended to work. Having an X-format doesn’t really make that better - X is really meant for indicating a parameter or type may have conflicts in independent usage due to not being registered, not “I want to do something entirely different than the thing with this unprefixed name”. The latter is creating something knowingly confusing, even before you get to this idea being counter to the purpose of media types |
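As a rough sketch of the content negotiation suggested above, a server could compare the q-values in the request's `Accept` header and fall back to `text/plain` unless the client explicitly prefers the JSON Lines type. The helper names are hypothetical, `application/jsonl` is the unregistered name proposed in this issue (swap in whichever name is chosen), and wildcard ranges such as `*/*` are deliberately ignored to keep the example short.

```python
JSONL_TYPE = "application/jsonl"  # assumption: proposed here, not IANA-registered
FALLBACK_TYPE = "text/plain"


def parse_accept(header):
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_range] = q
    return prefs


def choose_content_type(accept_header):
    """Serve JSON Lines only when it is strictly preferred over text/plain."""
    prefs = parse_accept(accept_header)
    jsonl_q = prefs.get(JSONL_TYPE, 0.0)
    plain_q = prefs.get(FALLBACK_TYPE, 0.0)
    return JSONL_TYPE if jsonl_q > plain_q else FALLBACK_TYPE
```

With this rule a browser, which typically sends `text/html` plus wildcards, gets the plain-text rendering, while an API client that sends `Accept: application/jsonl` gets the specific media type.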
@dwaite I've just finished implementing your idea to switch between content-types based on
If accepts doesn't include |
We also implemented the json lines as an alternative solution to streamed (normal) json for big data sets. After we did it, we found https://jsonlines.org/ and realized we just reinvented an already invented wheel. We left Content-Type application/json and limit=-1 as a request for this type of streamed response, but application/jsonl seems the easiest mime type for this. |
Hi, is there still interest in writing an RFC for a custom MIME type? I would like to support you. Always wanted to write an RFC ;-) |
@obfischer for us, it would mean we can replace application/json with application/jsonl when the response should be streamed JSON lines with limit > 0 (meaning just a page). |
btw, do I understand correctly that the output of claude code --output-format json-stream... is actually JSONL and that's how they should allow to specify it in a flag? If yes, maybe someone more knowledgeable can file an issue against them pointing to the issues here? We may spread JSONL awareness through back-linking?
|
I was really interested as well at the beginning, but I lack the experience and the understanding of all the required steps. It seems that it starts here https://authors.ietf.org/en/home and you can try to give it a shot, or we can have a call together one of these days to help us go through it. Let me know. Thanks. PS: there was also this initiative that would allow community registration in the standard tree https://www.ietf.org/archive/id/draft-ietf-mediaman-standards-tree-01.html CC @ferdnyc @darrelmiller |
@obfischer @sp4ce I'd be down to help too :) I'm a fan of the format and am quite irritated that this hasn't become a standard yet :D Having had a short look at the Authors Resources, it does not seem too complicated. I guess the best for us would be to create a repo from this template and collaborate there, GitHub-style, with Markdown editing and merge requests. As pointed out in the email thread, we would need a stable spec to register a data type / syntax suffix, so let's think about that later and focus on the RFC first. |
You'd better define the format in the RFC. You would end up with something which would look a lot like RFC 7464 😄 |
Sorry, my last comment got lost. I am afraid I didn't hit the comment button before leaving. ;-) Yes, let us simply start. I will provide a repository for our work today in the evening. |
Why create a new repo? We could use this one, I can create a new section in the menu called |
Ok, please do it ;-) |
If we use the template, we get automatic linting and rendering via the pipelines for free. But for the start, a draft-draft-RFC in this repo is enough :) |
Valid point. So let us create a new repository. What do you think? |
What do you think about adding a new HTTP content-type for jsonlines data?
What about
application/jsonl
?