You can also use
while playermove not in ('paper', 'scissors', 'rock'):
...
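To see the idea in context, here is a minimal sketch of such a validation loop. The function name ask_move and the injectable read parameter are my own additions for illustration, not part of the original game:

```python
valid_moves = ('paper', 'scissors', 'rock')

def ask_move(read=input):
    """Keep prompting until the player enters a valid move.

    The read parameter defaults to input() but can be replaced,
    e.g. for testing.
    """
    playermove = read("Your move? ").strip().lower()
    while playermove not in valid_moves:
        playermove = read("Please enter paper, scissors or rock: ").strip().lower()
    return playermove
```
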
I'm using IDLE
IDLE runs two processes: the first one (say process A) runs the IDLE GUI, the second one (say process B) runs your python code. Restarting the shell in IDLE means killing process B and starting a new one.
This snippet only restarts the process where it is called, in our case process B, but it won't restart IDLE. As far as I know, there is no way to restart process A from your program, because your python program has no notion that it is running in IDLE.
The IDLE process could be restarted programmatically by exploring the processes running on your computer and calling the appropriate commands, but I don't think it would be very useful. Why do you want to restart IDLE ?
Edit: actually, in IDLE it won't work well, because process A starts your process B with pipes to communicate with B, so that you can read data printed by your program in the IDLE console or send data to your program by writing in the IDLE console. Unfortunately, the pipes won't be enabled after restart_program()
, which means that your program restarts but can't communicate with the IDLE console. I'll try to design a small test program to show this.
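The restart snippet itself is not shown in this thread; a minimal sketch of such a function, assuming it is based on os.execv (an assumption on my part), would look like this:

```python
import os
import sys

def restart_program():
    """Replace the current process (process B under IDLE) with a fresh
    interpreter running the same script with the same arguments.

    On success this call never returns; under IDLE the pipes to the
    GUI (process A) are lost after the restart."""
    os.execv(sys.executable, [sys.executable] + sys.argv)
```
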
The signature of the bytes()
function gives the solution
class bytes(object)
| bytes(iterable_of_ints) -> bytes
| bytes(string, encoding[, errors]) -> bytes
| bytes(bytes_or_buffer) -> immutable copy of bytes_or_buffer
| bytes(int) -> bytes object of size given by the parameter initialized with null bytes
| bytes() -> empty bytes object
...
You want to use the second form bytes(string, encoding)
but instead you wrote bytes((string, encoding))
, that is to say a single tuple argument. Since a tuple is iterable, the only option for python is to believe that you used the first form. Then python expects that your tuple is a tuple of ints, and as this fails, you get the error. So use
bytes(msg, 'utf-8')
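A short demonstration of the difference between the two calls:

```python
msg = 'hello'

# Wrong: a single tuple argument, so python tries the
# iterable-of-ints form and fails on the string items.
try:
    bytes((msg, 'utf-8'))
except TypeError as exc:
    print('TypeError:', exc)

# Right: two separate arguments, the (string, encoding) form.
data = bytes(msg, 'utf-8')
print(data)  # b'hello'
```
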
Do you really mean a distributed denial of service attack ? Where would it be funny or cool ?
Great ! Then mark the thread as solved !
The flatten_dict()
function returns an iterable sequence of (key, value)
pairs. You can turn this to a list of pairs with list(flatten_dict(root))
.
A list of pairs L (or a iterable of pairs) can be transposed with zip(*L)
.
>>> L = [('A', 1), ('B', 2), ('C', 3), ('D', 4)]
>>> list(zip(*L))
[('A', 'B', 'C', 'D'), (1, 2, 3, 4)]
You don't understand the output of flatten_dict(). You can do this
def main():
    with open('source.xml', 'r', encoding='utf-8') as f:
        xml_string = f.read()
    xml_string = xml_string.replace('�', '')  # optional, to remove unwanted characters
    root = ElementTree.XML(xml_string)
    writer = csv.writer(open("test_out.csv", 'wt'))
    writer.writerows(zip(*flatten_dict(root)))

if __name__ == "__main__":
    main()
Also main() was not called in your code, due to indentation.
If there is still a dot at the end of a column header, it would be better to remove it ('Response.R.'
becomes 'Response.R'
). For this, use the rstrip('.')
method.
By default, the csv module selects the 'excel' dialect.
It is not that writerows()
is better: it is only shorter when you would otherwise call writerow()
several times, and shorter code reads better.
The problem is that I don't understand your rule for key generation. If you want to keep only the last word, it is very easy to do
for key, value in flatten_dict(root):
    key = key.rstrip('.').rsplit('.', 1)[-1]
    print(key, value)
edit: also, you can start with my generator and change the code the way you want to generate a different key.
It looks easy if you write pseudo-code
def fileExtensionExists(fileList, fileExtension):
    for fileName in fileList:
        get this fileName's extension, call it 'extension' (use parseExtension)
        if extension == fileExtension:
            return True
    # if we reach here, the extension was never met
    # it means that the extension does not exist in the list
    return False

def parseExtension(fileName):
    if there is a dot in fileName:
        return everything after the last dot
    else:
        return the empty string
Now you should be able to transform this into python.
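For checking your attempt afterwards, here is one possible python translation of the pseudo-code (try writing your own first):

```python
def parseExtension(fileName):
    """Return everything after the last dot, or '' if there is no dot."""
    if '.' in fileName:
        return fileName.rsplit('.', 1)[1]
    return ''

def fileExtensionExists(fileList, fileExtension):
    """Return True if some file in fileList has the given extension."""
    for fileName in fileList:
        if parseExtension(fileName) == fileExtension:
            return True
    # no file had the extension
    return False
```
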
Here is a variant which handles lists differently by enumerating the list items and removing their common tag. It works for MonthDayCount in your example, but I'm not sure it will work the way you want for all the lists in your files.
import xml.etree.cElementTree as ElementTree
from xml.etree.ElementTree import XMLParser

def flatten_list(aList, prefix=''):
    for i, element in enumerate(aList, 1):
        eprefix = "{}{}".format(prefix, i)
        if element:
            # treat like dict
            if len(element) == 1 or element[0].tag != element[1].tag:
                yield from flatten_dict(element, eprefix + '.')
            # treat like list
            elif element[0].tag == element[1].tag:
                yield from flatten_list(element, eprefix + '.')
        elif element.text:
            text = element.text.strip()
            if text:
                yield eprefix, text

def flatten_dict(parent_element, prefix=''):
    prefix = prefix + parent_element.tag + '.'
    if parent_element.items():
        for k, v in parent_element.items():
            yield prefix + k, v
    for element in parent_element:
        eprefix = prefix + element.tag + '.'
        if element:
            # treat like dict - we assume that if the first two tags
            # in a series are different, then they are all different.
            if len(element) == 1 or element[0].tag != element[1].tag:
                yield from flatten_dict(element, prefix=prefix)
            # treat like list - we assume that if the first two tags
            # in a series are the same, then the rest are the same.
            else:
                # here, we put the list in dictionary; the key is the
                # tag name the list elements all share in common, and
                # the value is the list itself
                yield from flatten_list(element, prefix=eprefix)
        # if the tag has attributes, add those to the dict
        if element.items(): …
I transformed the activestate recipe into a generator which flattens the xml structure. Look at the output, then try to define how you would transform the keys to get correct column names.
import xml.etree.cElementTree as ElementTree
from xml.etree.ElementTree import XMLParser
import json
import csv
import tokenize
import token
try:
    from collections import OrderedDict
    import json
except ImportError:
    from ordereddict import OrderedDict
    import simplejson as json
import itertools
import six
import string
#from csvkit import CSVKitWriter

def flatten_list(aList, prefix=''):
    for element in aList:
        if element:
            # treat like dict
            if len(element) == 1 or element[0].tag != element[1].tag:
                yield from flatten_dict(element, prefix)
            # treat like list
            elif element[0].tag == element[1].tag:
                yield from flatten_list(element, prefix)
        elif element.text:
            text = element.text.strip()
            if text:
                yield prefix, text

def flatten_dict(parent_element, prefix=''):
    prefix = prefix + parent_element.tag + '.'
    if parent_element.items():
        for k, v in parent_element.items():
            yield prefix + k, v
    for element in parent_element:
        eprefix = prefix + element.tag + '.'
        if element:
            # treat like dict - we assume that if the first two tags
            # in a series are different, then they are all different.
            if len(element) == 1 or element[0].tag != element[1].tag:
                yield from flatten_dict(element, prefix=prefix)
            # treat like list - we assume that if the first two tags
            # in a series are the same, then the rest are the same.
            else:
                # here, we put the list in dictionary; the key is the
                # tag name the list elements all share in common, and
                # the value is the list …
I see that you are using an old activestate recipe to parse xml and transform it into a dictionary. It means that the comments in this code don't have anything to do with your specific xml files.
You could traverse the parsed xml tree directly and generate the items of your final flat dictionary on the fly. It would be much more direct. This would probably mean two or three generators invoking each other recursively through the use of yield from
statements.
The key algorithmic points remain the same: what is your actual xml file's structure and what are your rules to create target key, value pairs ?
Yes put the snippet in a file named postprocess.py
then write
from postprocess import post_process
Here is an example of what you can do. This function transforms a dictionary by exploding the inner lists if they contain only strings
from postprocess import post_process

@post_process(dict)
def explode_lists(adict):
    for key, value in adict.items():
        if isinstance(value, list):
            if all(isinstance(x, str) for x in value):
                for i, x in enumerate(value, 1):
                    yield ('{}{}'.format(key, i), x)
                continue
        yield key, value

if __name__ == '__main__':
    D = {'D_B': ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'],
         'F_Int32': ['0', '0', '0', '0'],
         'OTF': '0',
         'PBDS_Double': ['0', '0', '0', '0', '0', '0', '0', '0'],
         'SCS_String': ['1', '2']}
    print(explode_lists(D))
""" my output -->
{'SCS_String2': '2', 'SCS_String1': '1', 'PBDS_Double1': '0', 'PBDS_Double3': '0', 'PBDS_Double2': '0', 'PBDS_Double5': '0', 'PBDS_Double4': '0', 'PBDS_Double7': '0', 'PBDS_Double6': '0', 'PBDS_Double8': '0', 'F_Int321': '0', 'F_Int323': '0', 'F_Int322': '0', 'F_Int324': '0', 'D_B5': '0', 'D_B4': '0', 'D_B7': '0', 'D_B6': '0', 'D_B1': '0', 'D_B3': '0', 'D_B2': '0', 'D_B9': '0', 'D_B8': '0', 'OTF': '0', 'D_B11': '0', 'D_B10': '0'}
"""
In this code, I used a very useful snippet that I wrote long ago in the following file postprocess.py
# postprocess.py

def post_process(*filters):
    """Decorator to post process a function's return value through a
    sequence of filters (functions with a single argument).

    Example:

        @post_process(f1, f2, f3)
        def f(*args, **kwd):
            ...
            return value

    then calling f(...) will actually return f3( f2( f1( f(...)))).

    This can also be used to convert a generator to a function
    returning a sequence type:

        @post_process(dict)
        def my_generator():
            ...
            yield key, value
    """
    def decorate(func):
        from functools import wraps
        @wraps(func)
        def wrapper(*args, **kwd): …
Here is a simple transformation snippet
>>> key = 'spam'
>>> L = ['foo', 'bar', 'baz']
>>> [('{}{}'.format(key, i), value) for i, value in enumerate(L, 1)]
[('spam1', 'foo'), ('spam2', 'bar'), ('spam3', 'baz')]
the key is replicated n number of times for n number of items in its associated list
Every output is possible; only the rules which govern the production of items must be very carefully defined. If you want DB1, DB2, etc., you must explain by which rule D_B
becomes DB1
, DB2
etc. In the same way, which rule transforms F_Int32
into F1
, F2
, etc.
Python can implement any precise rule that you define, but it cannot define the transformation rules for you.
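For instance, if the rule were "drop the underscores from the key, then append each item's 1-based index" (an assumption about your intent, chosen so that D_B gives DB1, DB2, ...), it could be sketched as:

```python
def explode_key(key, values):
    """Hypothetical rule: remove underscores from the key, then append
    the 1-based index of each item in the associated list."""
    base = key.replace('_', '')
    return [('{}{}'.format(base, i), v) for i, v in enumerate(values, 1)]

print(explode_key('D_B', ['0', '0', '0']))
# [('DB1', '0'), ('DB2', '0'), ('DB3', '0')]
```

Note that this rule would turn F_Int32 into FInt321, FInt322, ..., not F1, F2, which is why the exact rule must come from you.
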
Interestingly, with your function on other files I get the following error
I made the assumption that if the json file contains arrays with more than one item, all these items are strings. It fails in the files that you tried. Only strings can be concatenated. You must define the list of pairs that must be generated in these cases.
One final comment to make: Is it possible to write a function that applies to each key such that if the key is associated with a value, the key is replicated n number of times for n number of items in its associated list.
Again, this must be clarified with a json example and the complete list of pairs key/value that you want to produce with this json data.
Here is the kind of code that you can try
import csv
import json
import sys

def shrink(v):
    while True:
        if isinstance(v, list):
            if len(v) == 1:
                v = v[0]
            elif v:
                assert all(isinstance(x, str) for x in v)
                v = ''.join(v)
            else:
                raise ValueError('Empty list')
        elif isinstance(v, dict) and len(v) == 1:
            v = next(iter(v.values()))
        else:
            return v

def flatten(obj):
    assert isinstance(obj, dict)
    for k, v in obj.items():
        v = shrink(v)
        if isinstance(v, dict):
            yield from flatten(v)
        else:
            yield k, v

if __name__ == "__main__":
    with open("data2.json") as f:
        data = json.load(f)
    pairs = list(flatten(data))
    print(pairs)
    writer = csv.writer(sys.stdout)
    header = writer.writerow([k for k, v in pairs])
    row = writer.writerow([v for k, v in pairs])
The idea is to shrink the values with the following rules: if a value is a list with a single element, it is replaced by this element (shrunk). If the value is a list with more than 1 element, it is assumed that all the elements are strings, and they are concatenated and the result replaces the value. If the value is a dict with a single key, it is replaced by the value of this single item.
My output with the modified json file above is
[('DLA', '0'), ('FC', '00000'), ('PC', '0'), ('WC', '0'), ('CN', None), ('Description', None), ('Code', '0'), ('CMC', '0')]
DLA,FC,PC,WC,CN,Description,Code,CMC
0,00000,0,0,,,0,0
It is because temp is a string (str
type) while 85 is an integer. Integer comparison is different from string comparison (string ordering is lexicographic order). You should do
temp = int('32')
to convert to int for example.
This problem does not exist in recent versions of python where comparison of unordered data types is impossible. See the difference between python 2.7
>>> '32' > 85
True
and python 3.4
>>> '32' > 85
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() > int()
If you have the choice, learn python with python 3.4.
Of course you must do a bit of parsing, for example here is how to generate the pairs in the above example (the code must be modified if the json file contains lists)
#!/usr/bin/env python3
# -*-coding: utf8-*-
'''demonstrates incremental json parsing
'''
import ijson
import io

values = set(['null', 'boolean', 'number', 'string'])

class UnexpectedSyntax(RuntimeError):
    pass

def gen_pairs(jsonfile):
    parser = ijson.parse(jsonfile)
    p, e, v = next(parser)
    if e != 'start_map':
        raise UnexpectedSyntax
    map_depth = 1
    for p, e, v in parser:
        if e == 'end_map':
            map_depth -= 1
            if map_depth == 0:
                # the parser must be exhausted once the outer map closes
                try:
                    next(parser)
                except StopIteration:
                    return
                raise UnexpectedSyntax('Expected end of json source after map')
        elif e == 'map_key':
            key = v
            p, e, v = next(parser)
            if e in values:
                yield key, v
            elif e == 'start_map':
                map_depth += 1
            else:
                raise UnexpectedSyntax
        else:
            raise UnexpectedSyntax
    if map_depth > 0:
        raise UnexpectedSyntax('Incomplete map in json source')

ifh = io.open('data.json', encoding='utf8')
for key, value in gen_pairs(ifh):
    print(key, value)
""" my output -->
A 5
FEC 1/1/0001 12:00:00 AM
TE None
Locator None
Message Transfer Fee
AT None
FT None
FR True
FY None
FR None
FG 0
Comment None
FUD None
cID None
GEO None
bar baz
qux spam
ISO None
TRID None
XTY 931083
ANM None
NM None
CF Fee
ID 2
"""
@slavi
Yes, you need an incremental json parser. By googling a few minutes, I found ijson and its mother yajl-py. There may be others.
@saran_1
Using an incremental json parser, you could parse the file twice and write the first row, then the second row, even if the file is very large. Alternatively, you could write one file per row in one pass, then concatenate these files.
Once the text item is created, you can obtain a bounding box with canvas.bbox(item=text)
, or something similar.
Hm, you want to understand some basic things in programming, but the task of converting markdown to html is not such a basic thing. I would recommend it only to experienced programmers.
Such modules already exist: you can install the markdown module, then use its conversion function described here
import codecs
import markdown
input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
text = input_file.read()
html = markdown.markdown(text)
You could perhaps look in the source code of the markdown module to see how hard it is.
These commands remain dangerous even after they've been used, because they stay in your shell history, with a risk of being recalled by mistake; for example a ^r in a bash terminal can easily recall them. I wouldn't want an rm -rdf *
in my shell history.
I would say rm -rf *
.
The code finds the lines containing the word 'python'
in the file named 'history'
. The search function yields the line containing that word and a deque of at most five lines immediately before that line.
For example if the file contains
foo
bar
baz
qux
spam
eggs
ham
my python
hello
the search function will yield
('my python\n', deque(['baz\n','qux\n','spam\n','eggs\n','ham\n']))
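The search function itself is not quoted in the thread; a minimal sketch, close to the deque example in python's own collections documentation, could look like this:

```python
from collections import deque

def search(lines, pattern, history=5):
    """Yield each line containing pattern, together with a deque of at
    most `history` lines seen immediately before it."""
    previous = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            # yield a copy so later appends don't change it
            yield line, deque(previous)
        previous.append(line)
```
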
All right, then create a file foo.py with the following code
import datetime
getTime1 = "01:00PM"
t = datetime.datetime.strptime(getTime1, "%I:%M%p") + datetime.timedelta(days=36524)
time_half_hour_before = (t - datetime.timedelta(minutes=30)).strftime("%I:%M%p")
print time_half_hour_before
Then run it and post here everything that python says.
Edit: python 2.6 is already much better than python 2.4, so if you can upgrade, upgrade.
Unfortunately v is int or long
means
(v is int) or long
It is always True because bool(long
) is True. If you want to test if a value is an int or a long, the valid ways are
isinstance(v, (int, long)) # good programming
type(v) in (int, long) # doesn't work with subtypes
v.__class__ in (int, long) # dirty but faster
Here is a way. I replaced all the print statements by calls to a function printout()
. This function is built by using the print_function
feature which makes python 2 look like python 3. Note that you can still use print in other parts of the code but with a function syntax.
The from __future__
statement must appear at the very top of the program.
from __future__ import print_function
from functools import partial

def main2(ofile):
    printout = partial(print, file=ofile)
    with open("LotData1.txt", "r") as file:
        sideList = []
        for i in file:
            temp = i.strip().split()
            sideList.append([temp[0], temp[1], float(temp[2])])
    obj = Lot(sideList, "", "")
    Departures = obj.getDepartureClosure()
    Latitudes = obj.getLatitudeClosure()
    Departures_Correction = obj.getDepartureCorrectedClosure()
    Latitudes_Correction = obj.getLatitudeCorrectedClosure()
    count = len(sideList)
    printout("INPUT VALUES:\nCOURSE\tBEARING\t\tDISTANCE\t\tLATITUDE\tCORRECTION\tDEPARTURE\tCORRECTION")
    for i in range(count):
        printout("%s \t %s \t %+6.3f\t%+10.3f\t%+10.3f\t%+10.3f\t%+10.3f" %
                 (sideList[i][0], sideList[i][1], sideList[i][2], obj.getLatitudeClosure()[i], Latitudes_Correction[i], Departures[i], Departures_Correction[i]))
    AdjustedLatitudes, AdjustedDepartures, Adjusted_Distances = obj.adjustByCompassRule()
    AdjustedBearings = obj.adjustBearings()
    printout("\n\nADJUSTED VALUES:\nCOURSE\tLATITUDE\tDEPARTURE\tDISTANCE\tBEARING")
    for i in range(count):
        printout("%s\t%+10.3f\t%+10.3f\t%+10.3f\t%5d-%d-%d" %
                 (sideList[i][0], AdjustedLatitudes[i], AdjustedDepartures[i], Adjusted_Distances[i], AdjustedBearings[0][i], AdjustedBearings[1][i], AdjustedBearings[2][i]))
    Northings = obj.GetNorthings()
    Eastings = obj.GetEastings()
    printout("\n\nADJUSTED COORDINATES:\nCORNER\tNORTHING\tEASTING")
    for i in range(count):
        printout("%s\t%10.3f\t%10.3f" %
                 (sideList[i][0][0:1], Northings[i-1], Eastings[i-1]))
    LEC = obj.getLinearErrorOfClosure()
    TotalDistance = obj.getTotalDistance()
    Departure_Closure = sum(Departures)
    Latitude_Closure = sum(Latitudes)
    RelativePrecision = obj.getRelativePrecision()
    LECBearing = obj.getLECBearingToString()
    CorrLatClosure = obj.getLatCorrectedClosure()
    CorrDepClosure = obj.getDepCorrectedClosure()
    printout()
    printout("OTHER STATISTICS:")
    printout(" LATITUDE CLOSURE:", Latitude_Closure)
    printout(" DEPARTURE CLOSURE:", Departure_Closure)
    printout()
    printout(" LINEAR ERROR OF CLOSURE (LEC):", LEC)
    printout(" LEC BEARING: ", LECBearing)
    printout()
    printout(" TOTAL DISTANCE:", TotalDistance)
    printout(" RELATIVE PRECISION: 1:", RelativePrecision)
    printout(" CORRECTED LATITUDE CLOSURE:", CorrLatClosure)
    printout(" CORRECTED DEPARTURE CLOSURE", CorrDepClosure)

def main():
    with open('outputfile.txt', 'w') as ofile:
        main2(ofile)
Bonus: if you …
If you input [z * 5 for z in range(2, 10, 2)]
, it does not interfere.
I think it works only with recent versions of python. For me, it works with python 3.4 but not with 2.7.
Edit: you can write
for rec in records:
    tag, args = rec[0], rec[1:]
decorators only work on function call, not on definition
On the contrary, the decorator is called once, immediately after the function's definition
>>> def deco(f):
... print f.__name__, 'is being defined'
... return f
...
>>> @deco
... def foo(x):
... return x * x
...
foo is being defined
>>> foo(3)
9
>>>
Between the raw_input()
and your printing x
, there was an input()
. This is where x
was modified.
About the dangers of input()
, the point is that when a program writes write a list!
, the user does not think that his input can have a malicious effect such as erasing files or starting unwanted programs. There should be some mechanism in your program to prevent this from happening. Python's eval()
function is too powerful; it must be used in conjunction with code restricting the strings that can be passed to it.
An example is given in this snippet where an expression is carefully analysed before being given to eval()
, thus allowing only the evaluation of mathematical expressions.
When [x * 3 for x in range(2, 10, 2)]
is evaluated as a result of the input function, the variable x
takes the values 2, 4, 6, 8
. When the x
is printed, its value is the last one, 8
. You can avoid this by using another variable name.
Conclusion: don't use input()
in python 2, it is too dangerous.
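One standard way to get a safe subset of this behavior is ast.literal_eval, which only evaluates python literals (numbers, strings, lists, dicts, ...) and rejects everything else, including function calls. Note that it also rejects list comprehensions like the one above, so it is a restriction, not a drop-in replacement:

```python
import ast

def safe_literal(text):
    """Evaluate a string containing only python literals.

    Function calls and other executable constructs raise ValueError
    (or SyntaxError for text that does not even parse)."""
    return ast.literal_eval(text)

print(safe_literal('[1, 2, 3]'))  # [1, 2, 3]
try:
    safe_literal("__import__('os').remove('precious_file')")
except ValueError:
    print('rejected')
```
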
Here is my solution
def field(template, value):
    res = 0
    for t, v in list(zip(template, value))[::-1]:
        if t.__class__ is int:
            t = t.bit_length()
        else:
            t = int(t)
        res = (res << t) | v
    return res

if __name__ == '__main__':
    res = field([0b1, 0b111, 0b1111], [0, 3, 9])
    print(res, bin(res))
I'm afraid the indention didn't make it through the web editor. The best way to indent python code is with 4 space characters such as in
if x == 0:
    print("hello")  # <- see the 4 spaces at the beginning ?
It means that you must configure your editor so that it inserts 4 space characters when you hit the tab key. Which code editor are you using ? Look for the preference menu.
About your existing code, you can reindent it correctly by installing Tim Peters' reindent module. After installing reindent (pip install reindent
in a console window may work), you can type for example
reindent Secante.py
in a console window, and this will indent the Secante.py file with spaces.
DO NOT indent python code with tab characters '\t'
if you want to avoid indention issues.
All right. It seems strange to me that the bot2 folder is in C:\Python34 . You shouldn't clutter python's own directory with your modules under development.
You can try something such as
p = cfg.project
command = (
    "{cuffdiff} -p {threads} -b {fasta} -u {merged} "
    "-L {pheno0},{pheno1} -o {ofolder} {bam0} {bam1} {log}"
).format(
    cuffdiff=cfg.tool_cmd("cuffdiff"),
    threads=p["analysis"]["threads"],
    fasta=p["genome"]["fasta"],
    merged=p["experiment"]["merged"],
    pheno0=p["phenotype"][0],
    pheno1=p["phenotype"][1],
    ofolder=output_folder,
    bam0=p["samples"][0]["files"]["bam"],
    bam1=p["samples"][1]["files"]["bam"],
    log=p["analysis"]["log_file"],
)
print(command)
with your corrections. (for example, you may try to replace p["phenotype"][0]
with next(iter(p["phenotype"][0]))
).
touch foo
creates file foo
, if that's what you mean, but I probably missed something in your question. (?)
The problem is not with the plotting, there are very good libraries for plotting such as matplotlib or pyqtgraph or perhaps the gr framework, etc. The problem is rather that you're obtaining your numerical data from a R library (bioconductor, commeRbund etc), so you will need a way to access the numerical data.
You may be able to access the R packages through the rpy2 module. Another solution is to use R to write data files, then read these data files from python and use python's plotting packages.
dict.iteritems()
disappeared in python 3.0. Use dict.items()
in python 3, which returns a view (in python 2, dict.items()
used to return a list).
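A short demonstration of the view behavior in python 3:

```python
d = {'a': 1}
view = d.items()        # a view object, not a list
d['b'] = 2              # later changes to the dict are visible in the view
print(sorted(view))     # [('a', 1), ('b', 2)]
print(list(d.items()))  # build an explicit list when you really need one
```
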
By default open() uses the ASCII encoding
According to the documentation, the default encoding is locale.getpreferredencoding()
. For me it is
>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'
You can try to guess your file's encoding with the chardet module/cli utility.
Well, it is not very important to understand the section between ###...###
(the definition of the print_timing()
function). You can understand it later on, when you will study python's decorators. It is sufficient to know that if you write @print_timing
before a function definition, it will print the time it takes to run every time the function is called.
kevara()
is your own function, I suppose you understand it. kevara2()
is the same function written using an array of small integers instead of a list.
An array is a data structure similar to a list in many ways. For example, this code creates an array of 10 small unsigned integers:
>>> from array import array
>>> array('B', [1]) * 10
array('B', [1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
In kevara2()
, the array is filled with values 0 or 1 to indicate False or True. The 'B'
in the array creation tells python that the array items are small unsigned integers. A small integer takes 1 byte of memory (8 bits), while a python integer takes 28 bytes on my 64-bit computer. The use of getsizeof()
is an attempt to compute the memory size of the objects in bytes, although it is better to use Raymond Hettinger's total_size() recipe.
A shorter memory footprint could be achieved by using a bitarray structure from the bitarray module (1 bit per boolean value), but it turns out to be slower.
A possibility is to use an array.array
instead of a list
. It is a little slower but it takes about 8 times less memory.
#!/usr/bin/env python3
# -*-coding: utf8-*-
'''Compares Sieves of Eratosthenes with list or array implementation
'''
from array import array

### Vegaseat code to time the function ###
# https://www.daniweb.com/software-development/python/code/486298/a-timing-decorator-python
import time
from functools import wraps
from sys import getsizeof

def print_timing(func):
    '''
    create a timing decorator function
    use
    @print_timing
    just above the function you want to time
    '''
    @wraps(func)  # improves debugging
    def wrapper(*arg):
        start = time.perf_counter()  # needs python3.3 or higher
        result = func(*arg)
        end = time.perf_counter()
        fs = '{} took {:.3f} microseconds'
        print(fs.format(func.__name__, (end - start) * 1000000))
        return result
    return wrapper
### end of timing code ###

@print_timing
def kevara(n):
    marked = [False, False] + [True] * (n - 1)
    for p in range(2, n + 1):
        for i in range(p, int(n / p) + 1):
            marked[p*i] = False
    return marked

@print_timing
def kevara2(n):
    marked = array('B', [1]) * (n + 1)
    marked[0], marked[1] = 0, 0
    for p in range(2, n + 1):
        for i in range(p, int(n / p) + 1):
            marked[p*i] = 0
    return marked

if __name__ == '__main__':
    N = 1000000
    result = kevara(N)
    result2 = kevara2(N)
    assert result == [bool(item) for item in result2]
    print('{}\n{}'.format(getsizeof(result), getsizeof(result2)))

"""my output -->
kevara took 1545919.702 microseconds
kevara2 took 1694313.921 microseconds
8000072
1000065
"""
I have found 1 or 2 bugs in the last version. Here is a new version.
The code currently cannot be applied for multiline because it does not detect the beginning of the logical line of code, which means that in your example above, it will see the opcodes for line 2 but not for line 1. I need to add a mechanism to obtain the logical line.
I think I will upload a version in github soon, so that you can easily follow the changes in this code, the problem being that I don't have a lot of time to work on this now, so be patient...
EDIT: About obtaining the function name, you can get it from something similar to the printat()
snippet https://www.daniweb.com/software-development/python/code/479747/print-with-line-and-file-information . Look at chriswelborn's comment.
The first grain is missing in grains_sum
. The result is 2**65-1
, an odd number.
You can start with computing the term frequency of each term in every document; there is a snippet by Vegaseat for this. Once you have these frequencies, wikipedia has various formulas.
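As a starting point, here is a small sketch of computing term frequencies with collections.Counter (the document names and texts are made up for the example):

```python
from collections import Counter

documents = {
    'doc1': "the cat sat on the mat",
    'doc2': "the dog ate the cat",
}

# term frequency: occurrences of a term divided by the document length
tf = {}
for name, text in documents.items():
    words = text.lower().split()
    counts = Counter(words)
    tf[name] = {term: count / len(words) for term, count in counts.items()}

print(tf['doc1']['the'])  # 2 occurrences out of 6 words
```
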
It is a rather unusual first post. How could anybody have abused you as you joined 30 minutes ago ?
It seems very easy because the file is a TAB-separated file. Here is my code in python 2
#!/usr/bin/env python
# -*-coding: utf8-*-
'''doc
'''
from __future__ import (absolute_import, division,
                        print_function, unicode_literals)
import codecs

def process_file(filename):
    with codecs.open(filename, encoding='utf8') as ifh:
        for line in ifh:
            row = line.split('\t')
            english, hindi = row[-2:]
            print('English:', english)
            print('Hindi:', hindi)

if __name__ == '__main__':
    process_file('hindmonocorp05.txt')
And the result