Commit 57ec1d74
Authored 9 years ago by Vermaat

Don't trust encoding auto-detection when decoding

Parent: 71555028
Showing 2 changed files with 24 additions and 5 deletions:
mutalyzer/File.py: 15 additions, 3 deletions
mutalyzer/Retriever.py: 9 additions, 2 deletions
mutalyzer/File.py (+15 −3)

@@ -170,7 +170,13 @@ class File() :
         handle = _UniversalNewlinesByteStreamIter(handle, encoding=encoding,
                                                   buffer_size=BUFFER_SIZE)
-        buf = handle.read(BUFFER_SIZE)
+        try:
+            buf = handle.read(BUFFER_SIZE)
+        except UnicodeDecodeError:
+            self.__output.addMessage(__file__, 3, 'EBPARSE',
+                'Could not decode file (using %s encoding).' % encoding)
+            return None
         # Default dialect
         dialect = 'excel'
@@ -196,8 +202,14 @@ class File() :
         reader = csv.reader(handle, dialect)
         ret = []
-        for i in reader:
-            ret.append([c.decode('utf-8') for c in i])
+        try:
+            for i in reader:
+                ret.append([c.decode('utf-8') for c in i])
+        except UnicodeDecodeError:
+            self.__output.addMessage(__file__, 3, 'EBPARSE',
+                'Could not decode file (using %s encoding).' % encoding)
+            return None
         return ret
     #__parseCsvFile
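
The pattern applied in both File.py hunks can be sketched in isolation: an auto-detected encoding is only a guess, so the decode is wrapped and a failure is reported rather than allowed to propagate. This is a minimal Python 3 sketch (the original code is Python 2). `decode_with_guess` and the `messages` list are hypothetical stand-ins for the `File` method and `self.__output.addMessage`; they are not part of Mutalyzer.

```python
def decode_with_guess(raw, encoding, messages):
    """Decode `raw` bytes using a guessed encoding; return None on failure.

    Hypothetical helper: `messages` stands in for an output object that
    collects (code, description) error messages.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # The guess was wrong: report it and signal failure to the caller.
        messages.append(
            ('EBPARSE',
             'Could not decode file (using %s encoding).' % encoding))
        return None


messages = []
# The byte 0xE9 is 'é' in Latin-1 but is not valid UTF-8 on its own.
ok = decode_with_guess(b'caf\xe9', 'latin-1', messages)
bad = decode_with_guess(b'caf\xe9', 'utf-8', messages)
```

Returning `None` mirrors the diff; a caller has to check for it explicitly, which is what the Retriever.py change to the `_write` call does.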
mutalyzer/Retriever.py (+9 −2)

@@ -114,7 +114,13 @@ class Retriever(object) :
             encoding = 'utf-8'
         if not util.is_utf8_alias(encoding):
-            raw_data = raw_data.decode(encoding).encode('utf-8')
+            try:
+                raw_data = raw_data.decode(encoding).encode('utf-8')
+            except UnicodeDecodeError:
+                self._output.addMessage(__file__, 4, 'ENOPARSE',
+                    'Could not decode file (using %s encoding).' % encoding)
+                return None
         # Compress the data to save disk space.
         comp = bz2.BZ2Compressor()
@@ -368,7 +374,8 @@ class GenBankRetriever(Retriever):
                 "number to reduce downloading overhead." % unicode(record.id))
         #if
-        self._write(raw_data, outfile)
+        if not self._write(raw_data, outfile):
+            return None
         return outfile, GI
     #write
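
The two Retriever.py hunks appear to work together: once transcoding can fail and return `None`, the caller has to check the result instead of ignoring it. The sketch below assumes the first hunk sits in the method that the second hunk calls, which the diff does not show. It is written for Python 3; `is_utf8_alias`, `write_compressed`, and `fetch` are hypothetical stand-ins, and the real `util.is_utf8_alias` may be implemented differently.

```python
import bz2
import codecs


def is_utf8_alias(encoding):
    # Stand-in for util.is_utf8_alias, whose implementation is not shown.
    return codecs.lookup(encoding).name == 'utf-8'


def write_compressed(raw_data, encoding, messages):
    """Return bz2-compressed UTF-8 bytes, or None if decoding fails."""
    if not is_utf8_alias(encoding):
        try:
            raw_data = raw_data.decode(encoding).encode('utf-8')
        except UnicodeDecodeError:
            messages.append(
                ('ENOPARSE',
                 'Could not decode file (using %s encoding).' % encoding))
            return None
    # Compress the data to save disk space.
    return bz2.compress(raw_data)


def fetch(raw_data, encoding, messages, name):
    """Caller that checks the write result before reporting success."""
    if write_compressed(raw_data, encoding, messages) is None:
        return None
    return name
```

With this shape, `fetch(b'\xff', 'ascii', [], 'seq')` returns `None`, whereas a caller that ignored the result would report success for data that was never stored.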