As an example, take a look at the [PyPI download page for `numpy` 1.7.0](https://pypi.org/project/numpy/1.7.0/#files), released in Feb 2013. The "Built Distributions" section only shows a few `.exe` files for Windows (!), and some `manylinux1` wheels. However, the `manylinux1` wheels were not uploaded until April 2016. There was no mention whatsoever of macOS. Now compare it to [`numpy` 1.11.0](https://pypi.org/project/numpy/1.11.0/#files), released in March 2016: wheels for all platforms!
The reason why it is hard to find packages for a specific system, and why compilation was the preferred option for many, is [binary compatibility][abi]. Binary compatibility is a window of compatibility in which each combination of compiler version, core libraries such as `glibc`, and dependency libraries present on the build machine is compatible with destination systems.

Linux distributions achieve this by freezing compiler versions and library versions for a particular release cycle. Windows achieves this relatively easily because Python standardized on particular Visual Studio compiler versions for each Python release. Where a Windows package executable was reliably redistributable across versions of Windows, so long as the Python version was the same, Linux presented a more difficult target because it was (and is) so much harder to account for all of the little details that must line up.
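The "window of compatibility" idea can be made concrete with a small sketch (a hypothetical helper, not code from any packaging tool): a binary built against a given `glibc` only loads on systems whose `glibc` is the same age or newer, which is why building on an old base like CentOS 5 (glibc 2.5) maximized the set of destination systems.

```python
def runs_on(build_glibc: tuple, target_glibc: tuple) -> bool:
    """A binary built against glibc `build_glibc` loads on a system
    providing glibc `target_glibc` only if the target is the same or
    newer: glibc is backward compatible, not forward compatible."""
    return target_glibc >= build_glibc

# Building on CentOS 5 (glibc 2.5) covers nearly every later distro...
assert runs_on((2, 5), (2, 17))
# ...but building on a newer box produces binaries that fail on older systems.
assert not runs_on((2, 17), (2, 5))
```

The same reasoning applies to `libstdc++` and every other shared dependency, which is why the whole combination of versions on the build machine matters, not any single library.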
## The origins of `conda`
Continuum Analytics is now Anaconda, but this article tries to keep the company name contemporaneous with the state of the world.
:::
It's a little strange to describe Continuum's history here, but the company history is so deeply intertwined with conda-forge that it is essential for a complete story. During this time, Continuum (especially Ilan Schnell ([@ilanschnell](https://github.com/ilanschnell))) was developing its own internal recipes for packages. Continuum's Linux toolchain at the time was based on CentOS 5 and GCC 4.8. These details matter, because they effectively set the compatibility bounds of the entire conda package ecosystem.

The packages made from these internal recipes were available on the `free` channel, which in turn was part of a metachannel named `defaults`. The `defaults` channel made up the initial channel configuration for the Miniconda and Anaconda installers. Concurrently, Aaron Meurer ([@asmeurer](https://github.com/asmeurer)) led the `conda` and `conda-build` projects, contributed many recipes to the `conda-recipes` repository, and built many packages on his `asmeurer` binstar.org channel.

Aaron left Continuum in late 2015, leaving the community side of the projects in need of new leadership. Continuum hired Kale Franz ([@kalefranz](https://github.com/kalefranz)) to fill this role. Kale had huge ambitions for conda, but `conda-build` was not as much of a priority for him. Michael Sarahan ([@msarahan](https://github.com/msarahan)) stepped in to maintain `conda-build`.

In 2016, Rich Signell at USGS connected Filipe and Phil with Travis Oliphant at Continuum, who assigned Michael Sarahan to be Continuum's representative in `conda-forge`. Ray Donnelly ([@mingwandroid](https://github.com/mingwandroid)) joined the team at Continuum soon afterwards, bringing extensive experience in package managers and toolchains from his involvement in the MSYS2 project.
## conda-build 3 and the new compiler toolchain
There was a period of time where conda-forge and Continuum worked together closely, with conda-forge relying on Continuum to supply several core libraries. In its infancy, the `conda-forge` channel had far fewer packages than the `defaults` channel. conda-forge's reliance on `defaults` was partly to lower conda-forge's maintenance burden and reduce duplicate work, but it also helped keep mixtures of conda-forge and `defaults` channel packages working by reducing the possibility of divergence. Just as there were binary compatibility issues with mixing packages from among the many Binstar channels, mixing packages from `defaults` with `conda-forge` could be fragile and frustrating.

Around this point in time, [GCC 5 arrived][gcc-5] with a breaking change in `libstdc++`. These changes, among other compiler updates, began to make the CentOS 5 toolchain troublesome. Cutting-edge packages, such as the nascent TensorFlow project, required cumbersome patching to work with the older toolchain, if they worked at all.

There was strong pressure from the community to update the ecosystem (i.e. the toolchain, and implicitly everything built with it). There were two prevailing options. One was Red Hat's `devtoolset`, which used an older GCC version that statically linked the newer `libstdc++` parts into binaries, so that `libstdc++` updates were not necessary on end user systems. The other was to build GCC ourselves, and to ship the newer `libstdc++` library as a conda package. This was a community decision, and it was split roughly down the middle.

In the end, the community decided to take the latter route, for the sake of greater control over updating to the latest toolchains, instead of having to rely on Red Hat. One major advantage of providing our own toolchain was that we could provide the toolchain as a conda package instead of a system dependency, so we could now express toolchain requirements in our recipes and have better control over compiler flags and behavior.

The result of this overhaul crystallized in the `compiler(...)` Jinja function in [`conda-build` 3.x](conda-build-3) and the publication of the GCC 7 toolchain built from source in `defaults`[^anaconda-compilers]. Conda-build 3 also introduced dynamic pinning expressions that made it easier to maintain compatibility boundaries. ABI documentation from [ABI Laboratory][abilab] helped establish whether a given package should be pinned to major, minor, or bugfix versions.
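To illustrate what this looks like in practice, here is a minimal `meta.yaml` fragment using the `compiler(...)` Jinja function and a pinning expression; the package name and dependencies are hypothetical, and the exact pins a real recipe needs depend on the library's ABI stability:

```yaml
package:
  name: mylib          # hypothetical package
  version: "1.0.0"

requirements:
  build:
    # Resolved by conda-build to the channel's toolchain package,
    # instead of relying on whatever compiler the build machine has.
    - {{ compiler('c') }}
  host:
    - zlib
  run:
    # Dynamic pinning: derive a compatible run-time bound from the
    # version present at build time, rather than hand-maintaining it.
    - {{ pin_compatible('zlib') }}
```

Because the toolchain is itself a conda package, changing compilers for the whole ecosystem becomes a metadata change rather than a change to every build machine.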
## From `free` to `main`
Here around 2017, Continuum renamed itself to Anaconda, so let's switch those names from here out.
As more and more conflicts with `free` channel packages occurred, conda-forge gradually added more and more of its own core dependency packages to avoid those breakages. At the same time, Anaconda was working on two contracts that would prove revolutionary.

- Samsung wanted to use conda packages to manage their internal toolchains, and Ray suggested that this was complementary to our own internal needs to update our toolchain. Samsung's contract supported development of `conda-build` that greatly expanded its ability to support explicit variants of recipes. This became the major new feature set released in conda-build 3.x.
- Intel was working on developing their own Python distribution at the time, which they based on Anaconda, adding their accelerated math libraries and patches. Part of the Intel contract was that Anaconda would move all of their internal recipes into public-facing GitHub repositories.

Rather than putting another set of repositories (another set of changes to merge) in between internal and external sources, such as `conda-forge`, Michael and Ray pushed for a design where conda-forge would be the reference source of recipes. Anaconda would only carry local changes if they could not be incorporated into the conda-forge recipe for social, licensing, or technical reasons. The combination of these conda-forge based recipes and the new toolchain is what made up the `main` channel[^anaconda-5], which was also part of `defaults`.

The `main` channel represented a major step forward in keeping conda-forge and Anaconda aligned, which equates to smooth operation and happy users. The joined recipe base and toolchain have sometimes been contentious, with conda-forge wanting to move faster than Anaconda or vice versa. The end result has been a compromise between cutting-edge development and slower enterprise-focused development.
<!-- miniforge -->
[^chatting-ocefpaf]: [Filipe Fernandes on the Evolution of conda-forge](https://www.youtube.com/watch?v=U2oa_RLbTVA), Chatting with the Conda Community #1, 2024.