  1. Mar 17, 2016
  2. Feb 25, 2016
  3. Nov 24, 2015
  4. Nov 09, 2015
  5. Nov 04, 2015
  6. Nov 02, 2015
  7. Oct 26, 2015
  8. Oct 20, 2015
    • Cache transcript protein links in Redis · 473c732c
      Vermaat authored
      Caching of transcript protein links received from the NCBI Entrez
      service is a typical use case for Redis. This commit implements the
      cache in Redis and removes all use of our original database table.
      
      An Alembic migration copies all existing links from the database to
      Redis. The original `TranscriptProteinLink` database table is not
      dropped. This will be done in a future migration to ensure running
      processes don't error and to provide a rollback scenario.
      
      We also remove the expiration of links (originally defaulting to 30
      days), since we don't expect them to ever change. Negative links
      (caching a 'not found' result from Entrez) *are* still expiring,
      but with a longer default of 30 days (was 5 days).
      
      The configuration setting for the latter was renamed, yielding the
      following changes in the default configuration settings.
      
      Removed default settings:
      
          # Expiration time for transcript<->protein links from the NCBI (in seconds).
          PROTEIN_LINK_EXPIRATION = 60 * 60 * 24 * 30
      
          # Expiration time for negative transcript<->protein links from the NCBI (in
          # seconds).
          NEGATIVE_PROTEIN_LINK_EXPIRATION = 60 * 60 * 24 * 5
      
      Added default setting:
      
          # Cache expiration time for negative transcript<->protein links from the NCBI
          # (in seconds).
          NEGATIVE_LINK_CACHE_EXPIRATION = 60 * 60 * 24 * 30
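
The expiration behaviour described above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the key prefix, the function names, the empty-string marker for a negative link, and the `FakeRedis` stand-in are all assumptions. The only redis-py calls relied on are `set(name, value, ex=...)` and `get(name)`.

```python
# Hypothetical sketch: positive links never expire, negative links do.
NEGATIVE_LINK_CACHE_EXPIRATION = 60 * 60 * 24 * 30  # 30 days, in seconds

KEY_PREFIX = 'ncbi:transcript-to-protein:'  # assumed key layout


class FakeRedis(object):
    """In-memory stand-in exposing the two client calls used below."""

    def __init__(self):
        self.values = {}
        self.expirations = {}

    def set(self, name, value, ex=None):
        self.values[name] = value
        self.expirations[name] = ex

    def get(self, name):
        return self.values.get(name)


def cache_link(client, transcript, protein):
    """Store a link; `protein=None` records a 'not found' result."""
    key = KEY_PREFIX + transcript
    if protein is None:
        # Negative link: empty marker value, expires after the timeout.
        client.set(key, '', ex=NEGATIVE_LINK_CACHE_EXPIRATION)
    else:
        # Positive link: no expiration, it is not expected to change.
        client.set(key, protein)


def lookup_link(client, transcript):
    """Return (cached, protein); protein is None for a negative link."""
    value = client.get(KEY_PREFIX + transcript)
    if value is None:
        return (False, None)  # nothing cached, caller should ask Entrez
    if isinstance(value, bytes):
        value = value.decode('utf-8')  # a real client may return bytes
    return (True, value or None)
```

With a real server, a `redis.StrictRedis()` client could be passed instead of `FakeRedis()`; the accession strings used to try it out are up to the caller.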
  9. Oct 13, 2015
  10. Oct 10, 2015
  11. Sep 30, 2015
  12. Sep 27, 2015
    • Bi-directional caching of transcript-protein links · 8bbbc3a8
      Vermaat authored
      Previously transcript-protein links were assumed to always be
      indexed by transcript, and cached entries were allowed to have
      a `null` protein (caching the knowledge that there is no link
      for this transcript).
      
      Now we can cache links in both directions. Both transcript and
      protein are allowed to be `null` (but not at the same time),
      and the protein column has a new unique constraint.
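
The constraints described above could look roughly like this. It is a minimal sketch using the standard library `sqlite3` module rather than the project's real database layer; the table name, column names, and the unique constraint on the transcript column are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: either column may be NULL, but not both at once,
# and each non-NULL accession may appear at most once per column.
SCHEMA = """
CREATE TABLE transcript_protein_links (
    id INTEGER PRIMARY KEY,
    transcript TEXT UNIQUE,
    protein TEXT UNIQUE,
    CHECK (transcript IS NOT NULL OR protein IS NOT NULL)
)
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(':memory:')
    conn.execute(SCHEMA)
    return conn


def add_link(conn, transcript, protein):
    """Insert a link; a None on one side records 'no link known'."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            'INSERT INTO transcript_protein_links (transcript, protein) '
            'VALUES (?, ?)',
            (transcript, protein))
```

A row with both sides `None`, or a second row reusing an already linked protein, is rejected with `sqlite3.IntegrityError`.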
  13. Sep 11, 2015
  14. Sep 07, 2015
  15. Aug 11, 2015
  16. Aug 10, 2015
  17. Jul 16, 2015
  18. Jul 03, 2015
    • Use chardet instead of cchardet · dedad241
      Vermaat authored
      Issue #50 showed a problem in our file encoding detection, caused
      by our cut-off for the confidence as reported by the cchardet [1]
      library:
      
          >>> import cchardet
          >>> s = u'NM_000052.4:c.2407\u20132A>G'
          >>> b = s.encode('WINDOWS-1252')
          >>> cchardet.detect(b)
          {'confidence': 0.5, 'encoding': u'WINDOWS-1252'}
      
      We require a confidence strictly greater than 0.5 and default to
      UTF-8 otherwise, so this input is treated as UTF-8.
      
      If, however, we try the same thing using the chardet [2] library,
      we get a higher confidence for the same string:
      
          >>> import chardet
          >>> chardet.detect(b)
          {'confidence': 0.73, 'encoding': 'windows-1252'}
      
      So the two obvious ways to solve this are:
      
      1. Lower the confidence threshold.
      2. Use chardet instead of cchardet.
      
      We implement the second solution here, since it also removes a C
      library dependency and we are not worried about performance.
      
      Of course the detected encoding remains a guess which can still
      be wrong!
      
      [1] https://github.com/PyYoshi/cChardet
      [2] https://github.com/chardet/chardet
      
      Fixes #50
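
The threshold rule described in this message could be sketched as below. The function name, its parameters, and the injectable `detector` argument are assumptions for illustration and not the project's actual code; the only library call relied on is `chardet.detect`, which returns a dictionary with `encoding` and `confidence` keys.

```python
def guess_encoding(data, detector=None, threshold=0.5, default='utf-8'):
    """Guess the encoding of `data` (bytes), falling back to `default`.

    The detected encoding is only trusted when the reported confidence
    is strictly greater than `threshold`.
    """
    if detector is None:
        # Imported lazily so the sketch can be tried without chardet
        # installed by passing a stand-in detector.
        import chardet
        detector = chardet.detect
    result = detector(data)
    encoding = result.get('encoding')
    confidence = result.get('confidence') or 0.0
    if encoding and confidence > threshold:
        return encoding
    return default
```

With the two results quoted above, a confidence of exactly 0.5 falls back to the default, while 0.73 is accepted.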
  19. May 01, 2015
  20. Apr 30, 2015
  21. Jan 30, 2015
  22. Dec 16, 2014
  23. Dec 11, 2014
  24. Dec 10, 2014
    • Compilation of developer documentation on Read The Docs · 667f8799
      Vermaat authored
      We use a separate `doc/requirements.txt` file for building the
      documentation. Since we cannot install packages depending on C
      libraries on Read The Docs, we mock these packages in
      `doc/conf.py` and omit them from the requirements.
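
Mocking packages in a Sphinx `conf.py` is commonly done by registering placeholder objects in `sys.modules` before autodoc imports the project. The sketch below shows that general technique; the helper name and the module names in the usage note are hypothetical, and the actual list in `doc/conf.py` is not shown in this message.

```python
import sys
from unittest.mock import MagicMock


def mock_modules(names):
    """Register a MagicMock for each module name that is not yet loaded.

    Later `import name` statements then succeed without the real
    package (and the C library it depends on) being installed.
    """
    for name in names:
        # setdefault leaves a really installed module untouched.
        sys.modules.setdefault(name, MagicMock(name=name))
```

In a `conf.py` this would be called near the top, for example `mock_modules(['hypothetical_c_ext', 'hypothetical_c_ext.sub'])`, listing submodules explicitly when they are imported by name.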
  25. Dec 08, 2014
  26. Nov 04, 2014
  27. Oct 20, 2014
  28. Sep 06, 2014
  29. Aug 27, 2014
  30. Aug 26, 2014