Skip to content

Fixed ICU extraction for multiple processes. #140

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

ph4r05
Copy link

@ph4r05 ph4r05 commented Sep 15, 2014

Fixes segfault in the following scenario: two Android processes launched
during application start tried to extract ICU file at the same time
overwriting destination file by each other. This results in segmentation
fault while loading ICU in native library.

The segmentation fault ocurred in checkDataItem function at external/icu4c/common/udata.cpp. In particular on line with "pHeader->dataHeader.magic1==0xda".

How to reproduce a problem:

  • Create 2 independent processes in android application, e.g., two services running in different process.
  • Start both services on application start.
  • Open (different) SQLite databases in both processes.
  • Tested on Android KitKat (API 19), Motorola Moto G.

This proposed patch fixes the problem by:

  1. using file locking so only one process extracts ZIP archive and writes the destination in time;
  2. extracting to a temporary file, then renaming it to the destination file name;
  3. forcing flushing/syncing file to file system using channels and sync so native library can process it right after the extraction routine finishes.

Fixes segfault in the following scenario: two Android processes launched
during application start tried to extract ICU file at the same time
overwriting destination file by each other. This results in segmentation
fault while loading ICU in native library.
There may be problems with file locking, so extract ICU anyway.
@developernotes
Copy link
Member

Hello @ph4r05

Thank you for submitting a pull request. After reviewing the scenario and submitted code, I'm curious if you think this could be pushed up into the application codebase instead of the library itself? This specific race condition has never been reported before. We welcome your feedback, thanks!

@brody4hire
Copy link

@ph4r05 I would ask a different question: why do we have the clash (if /system/usr/icu/icudt46l.dat does not exist) in the first place? I suggest we start to log the path values and investigate. I can take a look this coming weekend, no promises though.

@ph4r05
Copy link
Author

ph4r05 commented Sep 18, 2014

@developernotes Regarding the application codebase. It is hard to say for me since I am using SQLCipher for Android at the moment (no other platforms) and I am not aware of the other affected platforms by this issue. I would say it is platform specific problem.

@brodybits In my opinion it is not a path clash literally. In my scenario it is the same application, but two different processes in the given app uses sqlcipher in parallel. It makes sense to have only one copy of icudt46l.dat for the whole application for me, processes can share it, but only one can create it. If I understand you correctly.

@developernotes
Copy link
Member

Hi @ph4r05

Regarding the application codebase. It is hard to say for me since I am using SQLCipher for Android at the moment (no other platforms) and I am not aware of the other affected platforms by this issue.

In this case, I mean pushing the locking check within your application itself during your call to SQLiteDatabase.loadLibs(…);, as appose to the Java-based client library of SQLCipher for Android? This could be shared logic between the two services.

@brody4hire
Copy link

@ph4r05 I have the following points. It would help the others if you could clearly describe the "use case" of what you need to start multiple processes for.

The old version of loadICUData() would only extract icudt46l.dat from zip if the file does not already exist and it looks like you preserved that functionality. However, if icudt46l.dat does exist after you acquire the lock, it looks like you do not release the lock before returning.

I am also wondering if we need the temporary file while extracting icudt46l.dat, now that we are using a lock?

@ph4r05
Copy link
Author

ph4r05 commented Sep 18, 2014

@developernotes Yes this also makes perfect sense. It could be done in this way. I just wanted to share my problem & solution because investigation was not that straightforward, it just segfaulted without any obvious reason. But application itself "does not know" that there happens ZIP extraction in the background under some circumstances (e.g., system wide file is not present). In this case I would suggest extracting this initialization functionality to a separate method.

@brodybits The lock is eventually released in finally block. Use case is for example this:

  1. process is a master service handling requests from the application. But service is using JNI code a lot and from time to time there may be a segfault (as in this case). so here I have a second process:
  2. watchdog process. Is isolated from segfaults that may happen in master service. But also uses sqlcipher storage for its own purpose. It can backup the master service and help to recover important state information. This design proved to work well if a native code is not 100% bugfree.

Regarding the temporary files - It was my first solution before I implemented the locking. I am not sure whether the locking support on file systems is 100% reliable on all android platforms and FS (e.g., not provided at all). It is just another partial solution to the problem if the lock implementation does not work as expected (some weird extreme cases).

But assuming the lock mechanism is not working correctly, the icuLockAcquire function would have to be redesigned because of the try block in while loop. This try block handles EAGAIN exceptions (another process holds this lock). I admit the catch block is too general, but EAGAIN is wrapped in IOException, as far as I remember so more complex error handling code would be required.

Code could be simplified - temporary file is not required if locking works as expected.

UPDATE: I hope my point is understandable. The old loadICUData() does the extraction if file does not exist, yes. But this whole functionality is not atomic. If two processes start approximately in the same time start this check, both would pass the exists() condition, because unziping takes some time. In that time they end up by rewriting each others file.

@brody4hire
Copy link

service is using JNI code a lot and from time to time there may be a
segfault (as in this case). so here I have a second process:

This doesn't sound very good. Can we investigate this more to
determine what is causing the crashes?

@ph4r05
Copy link
Author

ph4r05 commented Sep 18, 2014

@brodybits Probably we misunderstood each other. My application is using plenty of other native code, not only yours (sqlcipher). SQLCipher itself works fine, without segfaults. Just the first start of the application caused segfault because of the ZIP extraction overwrite problem. We diverted a bit from the original point - SQLCipher segfault in my scenario.

@brody4hire
Copy link

@ph4r05 first let me make a very important correction: sqlcipher is not "my" code at all! I did make a few contributions over the past couple years and have been a participant, nothing more. I have integrated both SQLite and SQLCipher with a Cordova/PhoneGap plugin, which is what actually triggered my level of involvement. I also contributed a couple fixes to external/Android.mk to get the ICU integration actually working (see #134). Credit goes to @developernotes, @sjlombardo, and others at https://github.com/orgs/sqlcipher/people.

@ph4r05
Copy link
Author

ph4r05 commented Sep 18, 2014

@brodybits OK sure. I was not specific enough, sorry for that.

@brody4hire
Copy link

@ph4r05 no worries then. So I find the first step, which is to identify the actual problem/issue and requirements to be well-done.

In your response to @developernotes I like the idea to load the ICU data file in a separate function and am wondering if we should make this optional (considering that it wasn't really working right in the past)?

@brody4hire
Copy link

but to load the ICU data file in a separate function breaks the API and I think to make it optional would increase the complexity, if it is even possible. Simplest to just let the application do the locking on SQLiteDatabase.loadLibs(…) as suggested by @developernotes.

@ph4r05
Copy link
Author

ph4r05 commented Sep 18, 2014

@brodybits Breaking API is not a good thing, sure and separating ICU extraction to a separate method was probably not a best idea.

It depends on the point of view. In my case, I prefer using fix proposed in pull request in the sqlcipher.jar, instead of forcing process-wide synchronization in my services, because I consider the ZIP extraction as an implementation detail of the SQLCipher so I find it easier to fix one place that actually causes the error instead of using workaround on a multiple places (loadLibs).

But you can also say in documentation that loadLibs() performs extraction and is unsafe to run in from different processes simultaneously (I am not sure whether current documentation mentions this or not).

@developernotes
Copy link
Member

Hi @ph4r05

Circling back on this issue, we've made some changes around how ICU data is extracted. I'm curious if you continue to see errors with the code that is in master?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants