
Make it possible to specify folders layout to be other than sub-{label}/[ses-{label}/] #54

@yarikoptic

Origin: Originally summarized/presented in bids-standard/bids-specification#751 (comment) (not duplicating here for now) while discussing a possible "stimuli BEP". That discussion boiled down to having some stim-{label}/ folder structure either at the top level or under stimuli/, a folder for which no structure is currently defined.
Current state: many use cases collected (see e.g. below), design being formalized in

Other relevant issues I found, in this bids-2-devel or elsewhere, which would be partially or fully addressed by such an enhancement:

  • Multi-site/center studies #11 : add /site-<site_label> level in favor of encoding it within /ses-{label}

  • Issues with dandi organize dandi/dandi-cli#1302 - in DANDI we support a lightweight "BIDS-inspired" layout (while BEP032 is still being worked on) which has no /ses-{label} subfolder. Such a subfolder makes little sense there: there are lots of sessions with 1 file per session, and the file name is possibly already long due to long sub and ses labels.

  • https://bids.neuroimaging.io/bep038 - Atlases BEP... IMHO could have atlas-<label>/ top level structure for the entity atlas

    • gives a use case inspiring the description to also allow for such a leading prefix ("atlases/"). So we might have something like {'.': ["subject", "[session]", "datatype"], 'atlases': ["atlas"]} to describe that at the top level we separate at the subject level, and under atlases/ at the "atlas" level. For a dataset which is purely an "atlas" dataset, it could be {'.': ["atlas"]}
  • https://bids.neuroimaging.io/bep035 - MEGA (Modular extensions for individual participant data mega-analyses) BEP. Proposes study- entity at the top level and studies.tsv to summarize.

  • would provide a solution for Allow composition of a BIDS dataset (dataset level) from smaller (subj or subj/ses) level #59

    Example (a prototype, since we have not settled on the syntax):

    The top-level dataset_description.json could have the "default" one:

    "DatasetLayout": { "." : [{ "entity": "subject", "folder": true }, { "entity": "session", "folder": true }] }

    whereas a nested BIDS dataset at the sub-XXX/ses-YYY/ level would have

    "DatasetLayout": { "." : [{ "entity": "subject", "folder": false }, { "entity": "session", "folder": false }] }

    thus signaling that sub-XXX_ses-YYY_ should still be present in the target filename as a prefix, but without any leading directories.

  • in the scope of the stimuli BEP (XXX, google doc), accommodating large stimuli databases, such as https://cocodataset.org/ with 330K images, would require some grouping. But we would need to figure out how to group in general, which would require more entities than just stim-

  • some heavy datasets might want even more entities to be used. E.g. in https://dandiarchive.org/dandiset/000026, there are thousands of files for about 50 different _sample-s under sub-I38/ses-SPIM/micr, so it would have been logical to add a sample-<label>/ level and a samples.tsv to describe them, making it something like

    "DatasetLayout": { "." : [{ "entity": "subject"}, { "entity": "session" }, { "entity": "sample" }] }
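A minimal sketch of how a tool might interpret the prototype "DatasetLayout" examples above. The syntax is not settled, so everything here is an assumption for illustration: the function name build_path, the ENTITY_KEYS mapping, the rule that "folder" defaults to true when omitted, and the rule that every entity always stays in the file name. Datatype folders (e.g. micr/) are left out for brevity.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from entity names to their key prefixes.
ENTITY_KEYS = {"subject": "sub", "session": "ses", "sample": "sample", "atlas": "atlas"}


def build_path(layout, entities, suffix, prefix="."):
    """Return a relative file path under a prototype DatasetLayout.

    layout   -- dict such as {".": [{"entity": "subject", "folder": True}]}
    entities -- dict such as {"subject": "XXX", "session": "YYY"}
    suffix   -- trailing part of the file name, e.g. "T1w.nii.gz"
    prefix   -- which layout key to use ("." is the dataset top level)
    """
    folders = []
    for level in layout[prefix]:
        entity = level["entity"]
        # Assumption: a level becomes a folder unless "folder" is false,
        # and a level whose entity is absent from the file is skipped.
        if entity in entities and level.get("folder", True):
            folders.append(f"{ENTITY_KEYS[entity]}-{entities[entity]}")

    # Assumption: all entities remain in the file name regardless of folders.
    name_parts = [f"{ENTITY_KEYS[name]}-{label}" for name, label in entities.items()]
    filename = "_".join(name_parts + [suffix])

    root = [] if prefix == "." else [prefix]
    return str(PurePosixPath(*root, *folders, filename))


default_layout = {".": [{"entity": "subject", "folder": True},
                        {"entity": "session", "folder": True}]}
nested_layout = {".": [{"entity": "subject", "folder": False},
                       {"entity": "session", "folder": False}]}
sample_layout = {".": [{"entity": "subject"}, {"entity": "session"},
                       {"entity": "sample"}]}

ids = {"subject": "XXX", "session": "YYY"}
print(build_path(default_layout, ids, "T1w.nii.gz"))
print(build_path(nested_layout, ids, "T1w.nii.gz"))
print(build_path(sample_layout,
                 {"subject": "I38", "session": "SPIM", "sample": "A"},
                 "SPIM.ome.zarr"))
```

With the "default" layout the file lands under sub-XXX/ses-YYY/, while the nested layout keeps the same sub-XXX_ses-YYY_ file name prefix with no leading directories. The third call adds a sample-A/ level below the session folder.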

Metadata

Labels: folder-structure (Proposals to reorganize files in the specification), impact: high (Estimated high impact change), modularity (Issues affecting modularity and composition of BIDS datasets)

Status: In Progress