Commit 57bbb4c
committed
[SPARK-44290][CONNECT] Session-based files and archives in Spark Connect
### Previous behaviour
Previously, we kept `JobArtifactSet` and leveraged thread local for each client.
1. The execution block is wrapped with `SessionHolder.withSession` [here](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala#L53).
2. `SessionHolder.withContextClassLoader` is then called [here](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala#L130) which in turn calls `JobArtifactSet.withActive` [here](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SessionHolder.scala#L118) and sets the active set to `SessionHolder.connectJobArtifactSet`
3. The actual `JobArtifactSet` that is used is built up [here](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/artifact/SparkConnectArtifactManager.scala#L157) in `SparkConnectArtifactManager.jobArtifactSet`
Since each client has their own `JobArtifactSet` made `active` when executing an operation, the `TaskDescription` would have artifacts specific to that client and subsequently, `IsolatedSessionState` in Executor. Therefore, we were able to separate the Spark Connect specific logic to the Spark Connect module.
### Problem
Mainly it was all good; however, the problem is that we don't call `SparkContext.addFile` or `SparkContext.addJar`, but we just pass it directly at the scheduler (to `TaskDescription`). This is fine in general but exposes several problems by not directly calling `SparkContext.addFile`:
- `SparkContext.postEnvironmentUpdate` is not invoked at `SparkContext.addFile` which matters in, for example, recording the events for History Server.
- Specifically for archives, `Utils.unpack(source, dest)` is not invoked at `SparkContext.addFile` in order to untar properly in the Driver.
Therefore, we should duplicate those logics in Spark Connect server side, which is not ideal.
In addition, we already added the isolation logic into the Executor. Driver and Executor are the symmetry (not Spark Connect Server <> Executor). Therefore, it matters about code readability, and expectation in their roles.
### Solution in this PR
This PR proposes to support session-based files and archives in Spark Connect. This PR leverages the basework for #41701 and #41625 (for jars in Spark Connect Scala client).
The changed logic is as follows:
- Keep the session UUID, and Spark Connect Server specific information such as REPL class path within a thread local.
- Add session ID when we add files or archives. `SparkContext` keeps them with a map `Map(session -> Map(file and timestamp))` in order to reuse the existing logic to address the problem mentioned
After that, on executor side,
- Executors create additional directory, named by session UUID, on the top of the default directory (that is the current working directory, see `SparkFiles.getRootDirectory`).
- When we execute Python workers, it sets the current working directory to the one created above.
- End users access to these files via using the current working directory e.g., `./blahblah.txt` in their Python UDF. Therefore, compatible with/without Spark Connect.
Note that:
- Here it creates Python workers for individual session because we set the session UUID as an environment variable, and we create new Python workers if environment variables are different, see also `SparkEnv.createPythonWorker`
- It already kills the daemon and Python workers if they are not used for a while.
### TODOs and limitations
Executor also maintains the file list but with a cache so it can evict the cache. However, it has a problem - It works as follows:
- New `IsolatedSessionState` is created.
- Task is executed once, and `IsolatedSessionState` holds the file list.
- Later `IsolatedSessionState` is evicted at https://github.com/apache/spark/pull/41625/files#diff-d7a989c491f3cb77cca02c701496a9e2a3443f70af73b0d1ab0899239f3a789dR187
- Executor will create a new `IsolatedSessionState` with empty file lists.
- Executor will attempt to redownload and overwrite the files (see https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L1058-L1064)
- `spark.files.overwrite` is `false` by default. So the task will suddenly fail at this point.
Possible solutions are:
- For 1., we should maintain a cache with TTL, and remove them
- For 2. we should have a dedicated directory (which this PR does) and remove the directory away when the cache is evicted. So the overwrite does not happen
### Why are the changes needed?
In order to allow session-based artifact control and multi tenancy.
### Does this PR introduce _any_ user-facing change?
Yes, this PR now allows multiple sessions to have their own space. For example, session A and session B can add a file in the same name. Previously this was not possible.
### How was this patch tested?
Unittests were added.
Closes #41495 from HyukjinKwon/session-base-exec-dir.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>1 parent e4a356a commit 57bbb4c
File tree
26 files changed
+326
-249
lines changed- connector/connect/server/src
- main/scala/org/apache/spark/sql/connect
- artifact
- service
- test/scala/org/apache/spark/sql/connect/artifact
- core/src
- main/scala/org/apache/spark
- api/python
- executor
- scheduler
- test/scala/org/apache/spark
- executor
- scheduler
- python/pyspark/sql/tests/connect/client
- resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos
- sql
- core/src/main/scala/org/apache/spark/sql/internal
- hive/src/test/scala/org/apache/spark/sql/hive/execution
26 files changed
+326
-249
lines changedLines changed: 30 additions & 25 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
29 | 29 | | |
30 | 30 | | |
31 | 31 | | |
32 | | - | |
| 32 | + | |
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
| |||
56 | 56 | | |
57 | 57 | | |
58 | 58 | | |
59 | | - | |
60 | 59 | | |
61 | 60 | | |
62 | 61 | | |
63 | 62 | | |
64 | 63 | | |
| 64 | + | |
| 65 | + | |
65 | 66 | | |
66 | 67 | | |
67 | | - | |
68 | 68 | | |
69 | 69 | | |
70 | 70 | | |
| |||
132 | 132 | | |
133 | 133 | | |
134 | 134 | | |
| 135 | + | |
| 136 | + | |
135 | 137 | | |
136 | | - | |
137 | 138 | | |
138 | | - | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
139 | 145 | | |
140 | 146 | | |
141 | 147 | | |
| |||
144 | 150 | | |
145 | 151 | | |
146 | 152 | | |
147 | | - | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
148 | 159 | | |
149 | | - | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
150 | 166 | | |
151 | 167 | | |
152 | 168 | | |
153 | 169 | | |
154 | | - | |
155 | | - | |
156 | | - | |
157 | | - | |
158 | | - | |
159 | | - | |
160 | | - | |
161 | | - | |
162 | | - | |
163 | | - | |
164 | | - | |
165 | | - | |
166 | | - | |
167 | | - | |
168 | | - | |
169 | | - | |
170 | | - | |
171 | 170 | | |
172 | 171 | | |
173 | 172 | | |
174 | 173 | | |
175 | | - | |
| 174 | + | |
176 | 175 | | |
177 | 176 | | |
178 | 177 | | |
| |||
183 | 182 | | |
184 | 183 | | |
185 | 184 | | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
186 | 191 | | |
187 | 192 | | |
188 | 193 | | |
| |||
Lines changed: 1 addition & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
88 | 88 | | |
89 | 89 | | |
90 | 90 | | |
91 | | - | |
92 | | - | |
93 | | - | |
94 | | - | |
95 | | - | |
96 | 91 | | |
97 | 92 | | |
98 | 93 | | |
| |||
114 | 109 | | |
115 | 110 | | |
116 | 111 | | |
117 | | - | |
118 | | - | |
| 112 | + | |
119 | 113 | | |
120 | 114 | | |
121 | 115 | | |
| |||
Lines changed: 0 additions & 14 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
54 | | - | |
55 | | - | |
56 | | - | |
57 | | - | |
58 | | - | |
59 | | - | |
60 | | - | |
61 | | - | |
62 | | - | |
63 | | - | |
64 | | - | |
65 | | - | |
66 | | - | |
67 | | - | |
68 | 54 | | |
69 | 55 | | |
70 | 56 | | |
| |||
Lines changed: 59 additions & 64 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
23 | 33 | | |
24 | 34 | | |
25 | 35 | | |
26 | 36 | | |
27 | 37 | | |
28 | 38 | | |
29 | 39 | | |
30 | | - | |
31 | | - | |
| 40 | + | |
32 | 41 | | |
33 | 42 | | |
34 | 43 | | |
35 | 44 | | |
36 | | - | |
37 | | - | |
38 | | - | |
| 45 | + | |
| 46 | + | |
39 | 47 | | |
40 | 48 | | |
41 | 49 | | |
42 | | - | |
43 | | - | |
44 | 50 | | |
45 | | - | |
| 51 | + | |
46 | 52 | | |
47 | 53 | | |
48 | 54 | | |
49 | 55 | | |
50 | 56 | | |
51 | | - | |
52 | | - | |
| 57 | + | |
| 58 | + | |
53 | 59 | | |
54 | 60 | | |
55 | 61 | | |
56 | | - | |
57 | 62 | | |
58 | 63 | | |
59 | | - | |
60 | | - | |
61 | | - | |
62 | | - | |
63 | | - | |
64 | | - | |
65 | | - | |
66 | | - | |
67 | | - | |
68 | | - | |
69 | | - | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
74 | | - | |
75 | | - | |
76 | | - | |
77 | | - | |
78 | | - | |
79 | 64 | | |
80 | | - | |
81 | | - | |
82 | | - | |
83 | | - | |
84 | | - | |
85 | | - | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
86 | 72 | | |
87 | | - | |
88 | | - | |
89 | | - | |
90 | | - | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
91 | 77 | | |
92 | | - | |
93 | | - | |
94 | | - | |
95 | | - | |
96 | | - | |
97 | | - | |
98 | | - | |
| 78 | + | |
99 | 79 | | |
100 | 80 | | |
101 | | - | |
102 | | - | |
103 | | - | |
104 | | - | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
105 | 87 | | |
106 | | - | |
107 | | - | |
108 | | - | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
109 | 91 | | |
110 | | - | |
| 92 | + | |
111 | 93 | | |
112 | 94 | | |
113 | 95 | | |
114 | 96 | | |
115 | | - | |
116 | | - | |
117 | | - | |
118 | | - | |
119 | | - | |
120 | | - | |
121 | | - | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
122 | 103 | | |
123 | | - | |
| 104 | + | |
| 105 | + | |
| 106 | + | |
| 107 | + | |
| 108 | + | |
| 109 | + | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
124 | 119 | | |
0 commit comments