Description
The current NCA implementation creates a numpy array (tmp) of shape (n, n, d, d). This is a problem even with medium-sized datasets, since the memory requirement grows as n².d². My computer has 64 GB of memory, and I cannot run NCA on a dataset of 1021 instances in dimension 122.
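For reference, a quick back-of-the-envelope check of why it does not fit (assuming float64 entries; these are my own estimates, not measured values):

```python
n, d = 1021, 122
# the (n, n, d, d) array alone needs n² · d² · 8 bytes
print(n ** 2 * d ** 2 * 8 / 1e9)   # ≈ 124 GB, well above the 64 GB available
```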
A simple solution to deal with this memory problem is to compute tmp_i at each iteration:
```python
tmp_i = np.einsum('...i,...j->...ij', dX[i], dX[i])  # shape (n, d, d)
```
This way, the memory requirement is only in n.d², and it works on the dataset of shape (1021, 122). However, it takes more than 12 hours to compute!
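For context, here is a minimal sketch of what the per-iteration variant looks like inside the gradient loop (an illustration of the idea, not the actual metric-learn code; the accumulation step is simplified, the real NCA gradient weights each outer product by softmax probabilities):

```python
import numpy as np

n, d = 1021, 122
X = np.random.randn(n, d)          # stand-in data, same shape as my dataset

grad = np.zeros((d, d))
for i in range(n):
    dX_i = X[i] - X                                        # differences to sample i, shape (n, d)
    tmp_i = np.einsum('...i,...j->...ij', dX_i, dX_i)      # outer products, shape (n, d, d)
    # only n · d² · 8 bytes ≈ 120 MB live at once, instead of n² · d² · 8 bytes
    grad += tmp_i.sum(axis=0)                              # simplified accumulation
```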
This solution increases the execution time, as tmp_i is recomputed at each iteration:
| Shape of the input data | Previous execution time | New execution time |
| --- | --- | --- |
| (44, 122) | 45 s | 52 s |
| (198, 122) | 20 min | 25 min |
| (1021, 122) | MemoryError | > 12 h |
Even for small datasets, the execution time is very long. Do you have any idea how to improve the performance of NCA?
- Could we use an optimization library such as scipy.optimize (as is done in the MLKR implementation, for instance)? Do you think it would significantly reduce the execution time? (See the sketch after this list.)
- Another solution, which seems more complex, would be to re-implement the optimization in Cython.
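To make the scipy.optimize idea concrete, here is a rough sketch of how the standard NCA objective and its analytic gradient (Goldberger et al.) could be passed to scipy.optimize.minimize, the way MLKR does. The function and variable names are mine and this is only an illustration of the approach, not a proposed implementation. Note that it only ever materializes (n, n) and (n, n, d) arrays, never (n, n, d, d):

```python
import numpy as np
from scipy.optimize import minimize

def nca_loss_grad(A_flat, X, y):
    """Negative NCA objective and its gradient w.r.t. the flattened
    transformation A, in the form expected by scipy.optimize.minimize."""
    n, d = X.shape
    A = A_flat.reshape(-1, d)
    Xa = X @ A.T                                   # transformed data, shape (n, d_out)

    # softmax over negative squared distances (with p_ii = 0)
    diff = Xa[:, None, :] - Xa[None, :, :]         # shape (n, n, d_out)
    dist = np.einsum('ijk,ijk->ij', diff, diff)    # squared distances, shape (n, n)
    np.fill_diagonal(dist, np.inf)
    dist -= dist.min(axis=1, keepdims=True)        # numerical stabilization
    p = np.exp(-dist)
    p /= p.sum(axis=1, keepdims=True)              # p[i, j] = p_ij

    same = (y[:, None] == y[None, :]) & ~np.eye(n, dtype=bool)
    p_i = (p * same).sum(axis=1)                   # prob. that i is classified correctly
    loss = -p_i.sum()                              # scipy minimizes, NCA maximizes

    # gradient: 2 A * sum_ij W_ij (x_i - x_j)(x_i - x_j)^T, computed with a
    # Laplacian-style identity so the (n, n, d, d) tensor is never formed
    W = p_i[:, None] * p - p * same
    lap = np.diag(W.sum(axis=1) + W.sum(axis=0)) - W - W.T
    grad = 2 * A @ (X.T @ lap @ X)
    return loss, -grad.ravel()

# toy usage on random labels, just to show the calling convention
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
y = rng.randint(0, 3, size=200)
A0 = np.eye(X.shape[1]).ravel()                    # start from the identity transform
res = minimize(nca_loss_grad, A0, args=(X, y), jac=True,
               method='L-BFGS-B', options={'maxiter': 50})
A_opt = res.x.reshape(-1, X.shape[1])
```

Whether this actually helps should depend mostly on how many gradient evaluations L-BFGS needs compared to the current gradient loop, since each evaluation still costs roughly the same.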