Issue
I want to print some floating point numbers so that they're always written in decimal form (e.g. 12345000000000000000000.0
or 0.000000000000012345
, not in scientific notation, yet I'd want to the result to have the up to ~15.7 significant figures of a IEEE 754 double, and no more.
What I want is ideally so that the result is the shortest string in positional decimal format that still results in the same value when converted to a float
.
It is well-known that the repr
of a float
is written in scientific notation if the exponent is greater than 15, or less than -4:
>>> n = 0.000000054321654321
>>> n
5.4321654321e-08 # scientific notation
If str
is used, the resulting string again is in scientific notation:
>>> str(n)
'5.4321654321e-08'
It has been suggested that I can use format
with f
flag and sufficient precision to get rid of the scientific notation:
>>> format(0.00000005, '.20f')
'0.00000005000000000000'
It works for that number, though it has some extra trailing zeroes. But then the same format fails for .1
, which gives decimal digits beyond the actual machine precision of float:
>>> format(0.1, '.20f')
'0.10000000000000000555'
And if my number is 4.5678e-20
, using .20f
would still lose relative precision:
>>> format(4.5678e-20, '.20f')
'0.00000000000000000005'
Thus these approaches do not match my requirements.
This leads to the question: what is the easiest and also well-performing way to print arbitrary floating point number in decimal format, having the same digits as in repr(n)
(or str(n)
on Python 3), but always using the decimal format, not the scientific notation.
That is, a function or operation that for example converts the float value 0.00000005
to string '0.00000005'
; 0.1
to '0.1'
; 420000000000000000.0
to '420000000000000000.0'
or 420000000000000000
and formats the float value -4.5678e-5
as '-0.000045678'
.
After the bounty period: It seems that there are at least 2 viable approaches, as Karin demonstrated that using string manipulation one can achieve significant speed boost compared to my initial algorithm on Python 2.
Thus,
- If performance is important and Python 2 compatibility is required; or if the
decimal
module cannot be used for some reason, then Karin's approach using string manipulation is the way to do it. - On Python 3, my somewhat shorter code will also be faster.
Since I am primarily developing on Python 3, I will accept my own answer, and shall award Karin the bounty.
Solution
Unfortunately it seems that not even the new-style formatting with float.__format__
supports this. The default formatting of float
s is the same as with repr
; and with f
flag there are 6 fractional digits by default:
>>> format(0.0000000005, 'f')
'0.000000'
However there is a hack to get the desired result - not the fastest one, but relatively simple:
- first the float is converted to a string using
str()
orrepr()
- then a new
Decimal
instance is created from that string. Decimal.__format__
supportsf
flag which gives the desired result, and, unlikefloat
s it prints the actual precision instead of default precision.
Thus we can make a simple utility function float_to_str
:
import decimal
# create a new context for this task
ctx = decimal.Context()
# 20 digits should be enough for everyone :D
ctx.prec = 20
def float_to_str(f):
"""
Convert the given float to a string,
without resorting to scientific notation
"""
d1 = ctx.create_decimal(repr(f))
return format(d1, 'f')
Care must be taken to not use the global decimal context, so a new context is constructed for this function. This is the fastest way; another way would be to use decimal.local_context
but it would be slower, creating a new thread-local context and a context manager for each conversion.
This function now returns the string with all possible digits from mantissa, rounded to the shortest equivalent representation:
>>> float_to_str(0.1)
'0.1'
>>> float_to_str(0.00000005)
'0.00000005'
>>> float_to_str(420000000000000000.0)
'420000000000000000'
>>> float_to_str(0.000000000123123123123123123123)
'0.00000000012312312312312313'
The last result is rounded at the last digit
As @Karin noted, float_to_str(420000000000000000.0)
does not strictly match the format expected; it returns 420000000000000000
without trailing .0
.
Answered By - Antti Haapala
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.