Gribouillis 1,391 Programming Explorer Team Colleague

You can also use

while playermove not in ('paper', 'scissors', 'rock'):
    ...
Louis_2 commented: I might use this, it would probably be more efficient, thank-you +0
Gribouillis 1,391 Programming Explorer Team Colleague

I'm using IDLE

Idle uses two processes: the first one (say process A) runs the IDLE GUI, the second one (say process B) runs your python code. Restarting the shell in Idle means killing process B and starting a new one.

This snippet only restarts the process where it is called, in our case process B, but it won't restart Idle itself. As far as I know, there is no way to restart process A from your program, because your python program has no notion that it is running in Idle.

The Idle process could be restarted programmatically by exploring the processes running on your computer and calling appropriate commands, but I don't think it would be very useful. Why do you want to restart Idle ?

Edit: actually, in Idle it won't work well, because the Idle process A starts your process B with pipes to communicate with B, so that you can read data printed by your program in the Idle console or send data to your program by writing in the Idle console. Unfortunately, the pipes won't be enabled after restart_program(), which means that your program restarts but can't communicate with the Idle console. I'll try to design a small test program to show this.

Louis_2 commented: Thanks for that, I'm now using a while loop instead. +0
Gribouillis 1,391 Programming Explorer Team Colleague

The signature of the bytes() function gives the solution

class bytes(object)
 |  bytes(iterable_of_ints) -> bytes
 |  bytes(string, encoding[, errors]) -> bytes
 |  bytes(bytes_or_buffer) -> immutable copy of bytes_or_buffer
 |  bytes(int) -> bytes object of size given by the parameter initialized with null bytes
 |  bytes() -> empty bytes object
...

You want to use the second form bytes(string, encoding) but instead you wrote bytes((string, encoding)), that is to say a single tuple argument. Since a tuple is iterable, the only option for python is to believe that you used the first form. Then python expects that your tuple is a tuple of ints, and as this fails, you get the error. So use

bytes(msg, 'utf-8')
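
A minimal sketch of the difference (msg is a hypothetical variable; the exact TypeError message varies between python versions):

msg = 'hello'                      # hypothetical string to encode
print(bytes(msg, 'utf-8'))         # two arguments: string + encoding -> b'hello'
# bytes((msg, 'utf-8'))            # a single tuple argument: python tries the
#                                  # iterable-of-ints form and raises TypeError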
ddanbe commented: Great knowledge. +15
Gribouillis 1,391 Programming Explorer Team Colleague

Do you really mean a distributed denial of service attack ? Where would it be funny or cool ?

Gribouillis 1,391 Programming Explorer Team Colleague

Great ! Then mark the thread as solved !

Gribouillis 1,391 Programming Explorer Team Colleague

The flatten_dict() function returns an iterable sequence of (key, value) pairs. You can turn this to a list of pairs with list(flatten_dict(root)).

A list of pairs L (or an iterable of pairs) can be transposed with zip(*L).

>>> L = [('A', 1), ('B', 2), ('C', 3), ('D', 4)]
>>> list(zip(*L))
[('A', 'B', 'C', 'D'), (1, 2, 3, 4)]
Gribouillis 1,391 Programming Explorer Team Colleague

You don't understand the output of flatten_dict(). You can do this

def main():
    with open('source.xml', 'r', encoding='utf-8') as f: 
        xml_string = f.read() 
    xml_string= xml_string.replace('�', '') #optional to remove ampersands. 
    root = ElementTree.XML(xml_string) 

    with open("test_out.csv", 'w', newline='') as out:
        writer = csv.writer(out)
        writer.writerows(zip(*flatten_dict(root)))

if __name__ == "__main__":
    main()

Also main() was not called in your code, due to indentation.

Gribouillis 1,391 Programming Explorer Team Colleague

If there is still a dot at the end of a column header, it would be better to remove it ('Response.R.' becomes 'Response.R'). For this, use the rstrip('.') method.
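
For example:

>>> 'Response.R.'.rstrip('.')
'Response.R'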

By default, the csv module selects the 'excel' dialect.

Using writerows() is not better, it is only shorter than calling writerow() several times, and shorter code reads better.

Gribouillis 1,391 Programming Explorer Team Colleague

The problem is that I don't understand your rule for key generation. If you want to keep only the last word, it is very easy to do

    for key, value in flatten_dict(root):
        key = key.rstrip('.').rsplit('.', 1)[-1]
        print(key,  value)

edit: also, you can start with my generator and change the code the way you want to generate a different key.

Gribouillis 1,391 Programming Explorer Team Colleague

It looks easy if you write pseudo-code

def fileExtensionExists(fileList, fileExtension):
    for fileName in fileList:
        get this filename's extension, call it 'extension' (use parseExtension)
        if extension == fileExtension:
            return True
    # if we reach here, the extension was not met
    # it means that the extension does not exist in the list
    return False

def parseExtension(fileName):
    if there is a dot in filename:
        return everything after the last dot
    else:
        return the empty string

Now you should be able to transform this into python.
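
If you get stuck, here is a minimal sketch of one possible translation (it follows the pseudo-code above; try to write your own version first):

def parseExtension(fileName):
    if '.' in fileName:
        # everything after the last dot
        return fileName.rsplit('.', 1)[-1]
    else:
        return ''

def fileExtensionExists(fileList, fileExtension):
    for fileName in fileList:
        extension = parseExtension(fileName)
        if extension == fileExtension:
            return True
    # the extension was never met: it does not exist in the list
    return False

print(fileExtensionExists(['photo.jpg', 'notes.txt'], 'txt'))   # True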

Gribouillis 1,391 Programming Explorer Team Colleague

Here is a variant which handles lists differently by enumerating the list items and removing their common tag. It works for MonthDayCount in your example, but I'm not sure it will work the way you want for all the lists in your files.

import xml.etree.cElementTree as ElementTree 
from xml.etree.ElementTree import XMLParser

def flatten_list(aList, prefix=''):
    for i, element in enumerate(aList, 1):
        eprefix = "{}{}".format(prefix, i)
        if element:
            # treat like dict 
            if len(element) == 1 or element[0].tag != element[1].tag: 
                yield from flatten_dict(element, eprefix+'.')
            # treat like list 
            elif element[0].tag == element[1].tag: 
                yield from flatten_list(element, eprefix+'.')
        elif element.text: 
            text = element.text.strip() 
            if text: 
                yield eprefix, text


def flatten_dict(parent_element, prefix=''):
    prefix = prefix + parent_element.tag + '.'
    if parent_element.items():
        for k, v in parent_element.items():
            yield prefix + k, v
    for element in parent_element:
        eprefix = prefix + element.tag + '.'
        if element:
            # treat like dict - we assume that if the first two tags 
            # in a series are different, then they are all different. 
            if len(element) == 1 or element[0].tag != element[1].tag: 
                yield from flatten_dict(element, prefix=prefix)
            # treat like list - we assume that if the first two tags 
            # in a series are the same, then the rest are the same. 
            else: 
                # here, we put the list in dictionary; the key is the 
                # tag name the list elements all share in common, and 
                # the value is the list itself
                yield from flatten_list(element, prefix=eprefix)
            # if the tag has attributes, add those to the dict
            if element.items(): …
Gribouillis 1,391 Programming Explorer Team Colleague

I transformed the activestate recipe into a generator which flattens the xml structure. Look at the output, then try to define how you would transform the keys to get correct column names.

import xml.etree.cElementTree as ElementTree 
from xml.etree.ElementTree import XMLParser 
import json 
import csv 
import tokenize 
import token 
try: 
    from collections import OrderedDict 
    import json 
except ImportError: 
    from ordereddict import OrderedDict 
    import simplejson as json 
import itertools 
import six 
import string 
#from csvkit import CSVKitWriter 


def flatten_list(aList, prefix=''): 
    for element in aList:
        if element: 
            # treat like dict 
            if len(element) == 1 or element[0].tag != element[1].tag: 
                yield from flatten_dict(element, prefix)
            # treat like list 
            elif element[0].tag == element[1].tag: 
                yield from flatten_list(element, prefix)
        elif element.text: 
            text = element.text.strip() 
            if text: 
                yield prefix, text


def flatten_dict(parent_element, prefix=''):
    prefix = prefix + parent_element.tag + '.'
    if parent_element.items():
        for k, v in parent_element.items():
            yield prefix + k, v
    for element in parent_element:
        eprefix = prefix + element.tag + '.'
        if element:
            # treat like dict - we assume that if the first two tags 
            # in a series are different, then they are all different. 
            if len(element) == 1 or element[0].tag != element[1].tag: 
                yield from flatten_dict(element, prefix=prefix)
            # treat like list - we assume that if the first two tags 
            # in a series are the same, then the rest are the same. 
            else: 
                # here, we put the list in dictionary; the key is the 
                # tag name the list elements all share in common, and 
                # the value is the list …
Gribouillis 1,391 Programming Explorer Team Colleague

I see that you are using an old activestate recipe to parse xml and transform it into a dictionary. It means that the comments in this code don't have anything to do with your specific xml files.

You could traverse the parsed xml tree directly and generate the items of your final flat dictionary on the fly. It would be much more direct. This would probably mean two or three generators invoking each other recursively through yield from statements.

The key algorithmic points remain the same: what is your actual xml file's structure, and what are your rules for creating the target key/value pairs?
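
A minimal sketch of the idea, with two generators calling each other through yield from (the tag names below are hypothetical, not your actual xml structure):

import xml.etree.ElementTree as ElementTree

def walk_element(elem, prefix=''):
    prefix = prefix + elem.tag
    if len(elem):                               # the element has children: recurse
        yield from walk_children(elem, prefix + '.')
    elif elem.text and elem.text.strip():       # a leaf with text: emit a pair
        yield prefix, elem.text.strip()

def walk_children(elem, prefix):
    for child in elem:
        yield from walk_element(child, prefix)

root = ElementTree.XML('<a><b>1</b><c><d>2</d></c></a>')
print(dict(walk_element(root)))    # {'a.b': '1', 'a.c.d': '2'}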

Gribouillis 1,391 Programming Explorer Team Colleague

Yes, put the snippet in a file named postprocess.py, then write

from postprocess import post_process
Gribouillis 1,391 Programming Explorer Team Colleague

Here is an example of what you can do. This function transforms a dictionary by exploding the inner lists if they contain only strings

from postprocess import post_process

@post_process(dict)
def explode_lists(adict):
    for key, value in adict.items():
        if isinstance(value, list):
            if all(isinstance(x, str) for x in value):
                for i, x in enumerate(value, 1):
                    yield ('{}{}'.format(key, i), x)
                continue
        yield key, value

if __name__ == '__main__':
    D = {'D_B': ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'],
        'F_Int32': ['0','0','0','0'],
        'OTF': '0',
        'PBDS_Double': ['0', '0', '0', '0', '0', '0', '0', '0'],
        'SCS_String': ['1', '2']}
    print(explode_lists(D))


""" my output -->
{'SCS_String2': '2', 'SCS_String1': '1', 'PBDS_Double1': '0', 'PBDS_Double3': '0', 'PBDS_Double2': '0', 'PBDS_Double5': '0', 'PBDS_Double4': '0', 'PBDS_Double7': '0', 'PBDS_Double6': '0', 'PBDS_Double8': '0', 'F_Int321': '0', 'F_Int323': '0', 'F_Int322': '0', 'F_Int324': '0', 'D_B5': '0', 'D_B4': '0', 'D_B7': '0', 'D_B6': '0', 'D_B1': '0', 'D_B3': '0', 'D_B2': '0', 'D_B9': '0', 'D_B8': '0', 'OTF': '0', 'D_B11': '0', 'D_B10': '0'}
"""

In this code, I used a very useful snippet that I wrote long ago in the following file postprocess.py

# postprocess.py

def post_process(*filters):
    """Decorator to post process a function's return value through a
    sequence of filters (functions with a single argument).

    Example:

        @post_process(f1, f2, f3)
        def f(*args, **kwd):
            ...
            return value

        then calling f(...) will actually return f3( f2( f1( f(...)))).

        This can also be used to convert a generator to a function
        returning a sequence type:

        @post_process(dict)
        def my_generator():
            ...
            yield key, value

    """

    def decorate(func):
        from functools import wraps
        @wraps(func)
        def wrapper(*args, **kwd): …
Gribouillis 1,391 Programming Explorer Team Colleague

Here is a simple transformation snippet

>>> key = 'spam'
>>> L = ['foo', 'bar', 'baz']
>>> [('{}{}'.format(key, i), value) for i, value in enumerate(L, 1)]
[('spam1', 'foo'), ('spam2', 'bar'), ('spam3', 'baz')]
Gribouillis 1,391 Programming Explorer Team Colleague

the key is replicated n number of times for n number of items in its associated list

Every output is possible; only the rules which govern the production of items must be very carefully defined. If you want DB1, DB2, etc., you must explain by which rule D_B becomes DB1, DB2, etc. In the same way, which rule transforms F_Int32 into F1, F2, etc.?

Python can implement any precise rule that you define, but it cannot define the transformation rules for you.

Gribouillis 1,391 Programming Explorer Team Colleague

Interestingly, with your function on other files I get the following error

I made the assumption that if the json file contains arrays with more than one item, all these items are strings, because only strings can be concatenated. That assumption fails for the files that you tried. You must define the list of pairs that must be generated in these cases.

One final comment to make: Is it possible to write a function that applies to each key such that if the key is associated with a value, the key is replicated n number of times for n number of items in its associated list.

Again, this must be clarified with a json example and the complete list of pairs key/value that you want to produce with this json data.

Gribouillis 1,391 Programming Explorer Team Colleague

Here is the kind of code that you can try

import csv 
import json 
import sys 

def shrink(v):
    while True:
        if isinstance(v, list):
            if len(v) == 1:
                v = v[0]
            elif v:
                assert all(isinstance(x, str) for x in v)
                v = ''.join(v)
            else:
                raise ValueError('Empty list')
        elif isinstance(v, dict) and len(v) == 1:
            v = next(iter(v.values()))
        else:
            return v

def flatten(obj):
    assert isinstance(obj, dict)
    for k, v in obj.items(): 
        v = shrink(v)
        if isinstance(v, dict):
            yield from flatten(v)
        else:
            yield k, v

if __name__ == "__main__": 
    with open("data2.json") as f: 
        data = json.load(f) 

    pairs = list(flatten(data))
    print(pairs)

    writer = csv.writer(sys.stdout) 
    header = writer.writerow([k for k, v in pairs]) 
    row = writer.writerow([v for k, v in pairs])

The idea is to shrink the values with the following rules: if a value is a list with a single element, it is replaced by this element (shrunk). If the value is a list with more than 1 element, it is assumed that all the elements are strings, and they are concatenated and the result replaces the value. If the value is a dict with a single key, it is replaced by the value of this single item.

My output with the modified json file above is

[('DLA', '0'), ('FC', '00000'), ('PC', '0'), ('WC', '0'), ('CN', None), ('Description', None), ('Code', '0'), ('CMC', '0')]
DLA,FC,PC,WC,CN,Description,Code,CMC
0,00000,0,0,,,0,0
Gribouillis 1,391 Programming Explorer Team Colleague

It is because temp is a string (str type) while 85 is an integer. Integer comparison is different from string comparison (strings are ordered lexicographically). You should do

temp = int(temp)

to convert the string to an integer, for example int('32') gives 32.

This problem does not exist in recent versions of python, where comparing values of unrelated types raises an error. See the difference between python 2.7

>>> '32' > 85
True

and python 3.4

>>> '32' > 85
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() > int()

If you have the choice, learn python with python 3.4.

Gribouillis 1,391 Programming Explorer Team Colleague

Of course you must do a bit of parsing, for example here is how to generate the pairs in the above example (the code must be modified if the json file contains lists)

#!/usr/bin/env python3
# -*-coding: utf8-*-
'''demonstrates incremental json parsing
'''

import ijson
import io

values = set(['null', 'boolean', 'number', 'string'])

class UnexpectedSyntax(RuntimeError):
    pass

def gen_pairs(jsonfile):
    parser = ijson.parse(jsonfile)
    p, e, v = next(parser)
    if e != 'start_map':
        raise UnexpectedSyntax
    map_depth = 1
    for p, e, v in parser:
        if e == 'end_map':
            map_depth -= 1
            if map_depth == 0:
                # the top-level map is closed: the parser must now be exhausted
                try:
                    next(parser)
                except StopIteration:
                    return
                raise UnexpectedSyntax('Expected end of json source after map')
        elif e == 'map_key':
            key = v
            p, e, v = next(parser)
            if e in values:
                yield key, v
            elif e == 'start_map':
                map_depth += 1
            else:
                raise UnexpectedSyntax
        else:
            raise UnexpectedSyntax
    if map_depth > 0:
        raise UnexpectedSyntax('Incomplete map in json source')

ifh = io.open('data.json', encoding='utf8')
for key, value in gen_pairs(ifh):
    print(key, value)

""" my output -->
A 5
FEC 1/1/0001 12:00:00 AM
TE None
Locator None
Message Transfer Fee
AT None
FT None
FR True
FY None
FR None
FG 0
Comment None
FUD None
cID None
GEO None
bar baz
qux spam
ISO None
TRID None
XTY 931083
ANM None
NM None
CF Fee
ID 2
"""
Slavi commented: awesome +6
Gribouillis 1,391 Programming Explorer Team Colleague

@slavi
Yes, you need an incremental json parser. By googling for a few minutes, I found ijson and the related yajl-py. There may be others.

@saran_1
Using an incremental json parser, you could parse the file twice and write the first row, then the second row, even if the file is very large. Alternatively, you could write one file per row in one pass, then concatenate these files.

Slavi commented: Thanks I will look into in =] +6
Gribouillis 1,391 Programming Explorer Team Colleague

Once the text item is created, you can obtain its bounding box with canvas.bbox(text_item), or something similar.
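
A minimal sketch (the widget and item names are hypothetical):

import tkinter as tk

root = tk.Tk()
canvas = tk.Canvas(root, width=200, height=100)
canvas.pack()
text_item = canvas.create_text(100, 50, text="hello world")
root.update_idletasks()                 # make sure the canvas geometry is computed
print(canvas.bbox(text_item))           # (x1, y1, x2, y2) of the text item
root.mainloop()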

Gribouillis 1,391 Programming Explorer Team Colleague

Hm, you want to understand some basic things in programming, but the task of converting markdown to html is not such a basic thing. I would recommend it only to experienced programmers.

Such modules already exist: you can install the markdown module, then use its conversion function described here

import codecs
import markdown
input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
text = input_file.read()
html = markdown.markdown(text)

You could perhaps look in the source code of the markdown module to see how hard it is.

Irene26 commented: can you send me link with source code of the markdown module, i'll be very thankful +0
Gribouillis 1,391 Programming Explorer Team Colleague

These commands remain dangerous even after they've been used, because they stay in your shell history, with a risk of being recalled by mistake; for example ^R in a bash terminal can easily recall them. I wouldn't want to have a rm -rdf * in my shell history.

cereal commented: +1 +13
Gribouillis 1,391 Programming Explorer Team Colleague

I would say rm -rf *.

Gribouillis 1,391 Programming Explorer Team Colleague

The code finds the lines containing the word 'python' in the file named 'history'. The search function yields the line containing that word and a deque of at most five lines immediately before that line.
For example if the file contains

foo
bar
baz
qux
spam
eggs
ham
my python
hello

the search function will yield

('my python\n', deque(['baz\n','qux\n','spam\n','eggs\n','ham\n']))
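
A minimal sketch of such a search function, in case it helps (the file name and word come from the description above, the rest is hypothetical):

from collections import deque

def search(filename, word, before=5):
    previous = deque(maxlen=before)          # keeps at most the 5 preceding lines
    with open(filename) as lines:
        for line in lines:
            if word in line:
                yield line, deque(previous)  # the matching line and a copy of its context
            previous.append(line)

for line, context in search('history', 'python'):
    print((line, context))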
Gribouillis 1,391 Programming Explorer Team Colleague

All right, then create a file foo.py with the following code

import datetime
getTime1 = "01:00PM"
t = datetime.datetime.strptime(getTime1, "%I:%M%p") + datetime.timedelta(days=36524)
time_half_hour_before = (t - datetime.timedelta(minutes=30)).strftime("%I:%M%p")
print time_half_hour_before

Then run it and post here everything that python says.

Edit: python 2.6 is already much better than python 2.4, so if you can upgrade, upgrade.

Gribouillis 1,391 Programming Explorer Team Colleague

Unfortunately v is int or long means

(v is int) or long

It is always True because bool(long) is True. If you want to test if a value is an int or a long, the valid ways are

isinstance(v, (int, long)) # good programming
type(v) in (int, long) # doesn't work with subtypes
v.__class__ in (int, long) # dirty but faster
Tcll commented: oh ok, my mistake :) +4
Gribouillis 1,391 Programming Explorer Team Colleague

Here is a way. I replaced all the print statements by calls to a function printout(). This function is built by using the print_function feature which makes python 2 look like python 3. Note that you can still use print in other parts of the code but with a function syntax.

The from __future__ statement must appear at the very top of the program.

from __future__ import print_function
from functools import partial

def main2(ofile):

    printout = partial(print, file=ofile)

    with open ("LotData1.txt", "r") as file:
        sideList = []
        for i in file:
            temp = i.strip().split()
            sideList.append([temp[0], temp[1], float(temp[2])])



    obj = Lot(sideList, "", "")
    Departures = obj.getDepartureClosure()
    Latitudes = obj.getLatitudeClosure()
    Departures_Correction = obj.getDepartureCorrectedClosure()
    Latitudes_Correction = obj.getLatitudeCorrectedClosure()




    count = len(sideList)

    printout("INPUT VALUES:\nCOURSE\tBEARING\t\tDISTANCE\t\tLATITUDE\tCORRECTION\tDEPARTURE\tCORRECTION")
    for i in range(count):
        printout("%s \t  %s \t %+6.3f\t%+10.3f\t%+10.3f\t%+10.3f\t%+10.3f"%\
        (sideList[i][0], sideList[i][1], sideList[i][2], obj.getLatitudeClosure()[i], Latitudes_Correction[i], Departures[i], Departures_Correction[i]))


    AdjustedLatitudes,  AdjustedDepartures, Adjusted_Distances = obj.adjustByCompassRule()
    AdjustedBearings = obj.adjustBearings()
    printout("\n\nADJUSTED VALUES:\nCOURSE\tLATITUDE\tDEPARTURE\tDISTANCE\tBEARING")
    for i in range(count):

        printout("%s\t%+10.3f\t%+10.3f\t%+10.3f\t%5d-%d-%d"%\
        (sideList[i][0], AdjustedLatitudes[i], AdjustedDepartures[i], Adjusted_Distances[i],AdjustedBearings[0][i],AdjustedBearings[1][i],AdjustedBearings[2][i]))


    Northings = obj.GetNorthings()
    Eastings = obj.GetEastings()
    printout("\n\nADJUSTED COORDINATES:\nCORNER\tNORTHING\tEASTING")
    for i in range(count):
        printout("%s\t%10.3f\t%10.3f" %\
        (sideList[i][0][0:1],Northings[i-1],Eastings[i-1]))


    LEC = obj.getLinearErrorOfClosure()
    TotalDistance = obj.getTotalDistance()
    Departure_Closure = sum(Departures)
    Latitude_Closure = sum(Latitudes)
    RelativePrecision = obj.getRelativePrecision()
    LECBearing = obj.getLECBearingToString()
    CorrLatClosure = obj.getLatCorrectedClosure()
    CorrDepClosure = obj.getDepCorrectedClosure()



    printout()
    printout("OTHER STATISTICS:")
    printout(" LATITUDE CLOSURE:", Latitude_Closure)

    printout(" DEPARTURE CLOSURE:", Departure_Closure)
    printout()
    printout(" LINEAR ERROR OF CLOSURE (LEC):", LEC)
    printout(" LEC BEARING: ", LECBearing)
    printout()
    printout(" TOTAL DISTANCE:", TotalDistance)

    printout(" RELATIVE PRECISION: 1:", RelativePrecision)

    printout(" CORRECTED LATITUDE CLOSURE:", CorrLatClosure)
    printout(" CORRECTED DEPARTURE CLOSURE", CorrDepClosure)

def main():
    with open('outputfile.txt', 'w') as ofile:
        main2(ofile)

Bonus: if you …

Gribouillis 1,391 Programming Explorer Team Colleague

If you input [z * 5 for z in range(2, 10, 2)] instead, it does not interfere with x.

Gribouillis 1,391 Programming Explorer Team Colleague

I think it works only with recent versions of python. For me, it works with python 3.4 but not with 2.7.

Edit: you can write

for rec in records:
    tag, args = rec[0], rec[1:]
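
For example, assuming the construct in question was the starred assignment tag, *args = rec (records is a hypothetical list of tuples):

records = [('point', 1, 2), ('point', 3, 4)]

for rec in records:
    tag, *args = rec                    # extended unpacking (args is a list): python 3 only
    print(tag, args)

for rec in records:
    tag, args = rec[0], rec[1:]         # same idea (args is a tuple), works in python 2 and 3
    print(tag, args)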
Slavi commented: awesome :D +6
Gribouillis 1,391 Programming Explorer Team Colleague

decorators only work on function call, not on definition

On the contrary, the decorator is called once, immediately after the function's definition

>>> def deco(f):
...  print f.__name__, 'is being defined'
...  return f
... 
>>> @deco
... def foo(x):
...  return x * x
... 
foo is being defined
>>> foo(3)
9
>>> 
Gribouillis 1,391 Programming Explorer Team Colleague

Between the raw_input() and your printing x, there was an input(). This is where x was modified.

About the dangers of input(), the point is that when a program prints a prompt such as write a list!, the user does not expect that his input can have a malicious effect such as erasing files or starting unwanted programs. There should be some mechanism in your program to prevent this from happening. Python's eval() function is too powerful, so it must be used together with code that restricts the string passed to it.

An example is given in this snippet where an expression is carefully analysed before being given to eval(), thus allowing only the evaluation of mathematical expressions.
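
A different and much simpler restriction than the linked snippet, in case it helps, is ast.literal_eval(), which only accepts python literals and refuses to evaluate anything else (a sketch, not the snippet's method):

import ast

reply = raw_input('write a list! ')      # python 2; use input() in python 3
try:
    value = ast.literal_eval(reply)      # accepts [1, 2, 3] but rejects calls, comprehensions...
except (ValueError, SyntaxError):
    value = None
print(value)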

Gribouillis 1,391 Programming Explorer Team Colleague

When [x * 3 for x in range(2, 10, 2)] is evaluated as a result of the input function, the variable x takes the values 2, 4, 6, 8. When the x is printed, its value is the last one, 8. You can avoid this by using another variable name.

Conclusion: don't use input() in python 2, it is too dangerous.
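
A python 2 sketch reproducing what happens (the printed values follow from the explanation above):

x = 'untouched'
value = input('enter an expression: ')   # the user types [x * 3 for x in range(2, 10, 2)]
print value                              # [6, 12, 18, 24]
print x                                  # 8: the comprehension reused and overwrote x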

Gribouillis 1,391 Programming Explorer Team Colleague

Here is my solution

def field(template, value):
    res = 0
    for t, v in list(zip(template, value))[::-1]:  # list() so it also works in python 3
        if t.__class__ is int:
            t = t.bit_length()
        else:
            t = int(t)
        res = (res << t)|v
    return res

if __name__ == '__main__':
    res = field([0b1,0b111,0b1111], [0,3,9])
    print(res, bin(res))
Tcll commented: very nice +4
Gribouillis 1,391 Programming Explorer Team Colleague

I'm afraid the indentation didn't make it through the web editor. The best way to indent python code is with 4 space characters, such as in

if x == 0:
    print("hello") # <- see the 4 space at the beginning ?

It means that you must configure your editor so that it inserts 4 space characters when you hit the tab key. Which code editor are you using ? Look for the preference menu.

About your existing code, you can reindent it correctly by installing Tim Peters' reindent module. After installing reindent (pip install reindent in a console window may work), you can type for example

reindent Secante.py

in a console window, and this will indent the Secante.py file with spaces.

DO NOT indent python code with tab characters '\t' if you want to avoid indentation issues.

Gribouillis 1,391 Programming Explorer Team Colleague

All right. It seems strange to me that the bot2 folder is in C:\Python34 . You shouldn't clutter python's own directory with your modules under development.

Gribouillis 1,391 Programming Explorer Team Colleague

You can try something such as

p = cfg.project

command = (
    "{cuffdiff} -p {threads} -b {fasta} -u {merged} "
    "-L {pheno0},{pheno1} -o {ofolder} {bam0} {bam1} {log}"
    ).format(
    cuffdiff=cfg.tool_cmd("cuffdiff"),
    threads=p["analysis"]["threads"],
    fasta=p["genome"]["fasta"],
    merged=p["experiment"]["merged"],
    pheno0=p["phenotype"][0],
    pheno1=p["phenotype"][1],
    ofolder=output_folder,
    bam0=p["samples"][0]["files"]["bam"],
    bam1=p["samples"][1]["files"]["bam"],
    log=p["analysis"]["log_file"]
)

print(command)

with your corrections. (for example, you may try to replace p["phenotype"][0] with next(iter(p["phenotype"][0]))).

Gribouillis 1,391 Programming Explorer Team Colleague

touch foo creates file foo, if that's what you mean, but I probably missed something in your question. (?)

Gribouillis 1,391 Programming Explorer Team Colleague

The problem is not with the plotting, there are very good libraries for plotting such as matplotlib or pyqtgraph or perhaps the gr framework, etc. The problem is rather that you're obtaining your numerical data from an R library (bioconductor, cummeRbund, etc.), so you will need a way to access the numerical data.

You may be able to access the R packages through the rpy2 module. Another solution is to use R to write data files, then read these data files from python and use python's plotting packages.
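
A minimal sketch of the second approach (the file and column names are hypothetical):

import csv
import matplotlib.pyplot as plt

# R side (not shown): write.csv(some_data_frame, "expression_data.csv")
xs, ys = [], []
with open('expression_data.csv') as f:
    for row in csv.DictReader(f):
        xs.append(float(row['fpkm_sample1']))
        ys.append(float(row['fpkm_sample2']))

plt.scatter(xs, ys)
plt.xlabel('fpkm_sample1')
plt.ylabel('fpkm_sample2')
plt.show()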

Gribouillis 1,391 Programming Explorer Team Colleague

dict.iteritems() disappeared in python 3. Use dict.items() in python 3, which returns a view (in python 2, dict.items() used to return a list).
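
For example, in python 3:

>>> d = {'a': 1, 'b': 2}
>>> d.items()
dict_items([('a', 1), ('b', 2)])
>>> list(d.items())
[('a', 1), ('b', 2)]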

Gribouillis 1,391 Programming Explorer Team Colleague

By default open() uses the ASCII encoding

According to the documentation, the default encoding is locale.getpreferredencoding(). For me it is

>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'

You can try to guess your file's encoding with the chardet module/cli utility.
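
For example with the chardet module (the file name is hypothetical):

import chardet

with open('data.txt', 'rb') as f:        # read raw bytes, not text
    guess = chardet.detect(f.read())
print(guess)                             # e.g. {'encoding': 'utf-8', 'confidence': 0.99, ...}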

Gribouillis 1,391 Programming Explorer Team Colleague

Well, it is not very important to understand the section between ###...### (the definition of the print_timing() function). You can understand it later, when you study python's decorators. It is sufficient to know that if you write @print_timing before a function definition, the function will print the time it takes to run every time it is called.

kevara() is your own function, I suppose you understand it. kevara2() is the same function written using an array of small integers instead of a list.

An array is a data structure similar to a list in many ways. For example, this code creates an array of 10 small unsigned integers:

>>> from array import array
>>> array('B', [1]) * 10
array('B', [1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In kevara2(), the array is filled with values 0 or 1 to indicate False or True. The 'B' in the array creation tells python that the array items are small unsigned integers. Such an item takes 1 byte of memory (8 bits), while a python integer takes 28 bytes on my 64-bit computer. The use of getsizeof() is an attempt to compute the memory size of the objects in bytes, although it is better to use Raymond Hettinger's total_size() recipe.

An even smaller memory footprint could be achieved by using a bitarray structure from the bitarray module (1 bit per boolean value), but it turns out to be slower.

Gribouillis 1,391 Programming Explorer Team Colleague

A possibility is to use an array.array instead of a list. It is a little slower but it takes about 8 times less memory.

#!/usr/bin/env python3
# -*-coding: utf8-*-
'''Compares Sieves of Eratosthenes with list or array implementation
'''

from array import array

### Vegaseat code to time the function ###
# https://www.daniweb.com/software-development/python/code/486298/a-timing-decorator-python
import time
from functools import wraps
from sys import getsizeof

def print_timing(func):
    '''
    create a timing decorator function
    use
    @print_timing
    just above the function you want to time
    '''
    @wraps(func)  # improves debugging
    def wrapper(*arg):
        start = time.perf_counter()  # needs python3.3 or higher
        result = func(*arg)
        end = time.perf_counter()
        fs = '{} took {:.3f} microseconds'
        print(fs.format(func.__name__, (end - start)*1000000))
        return result
    return wrapper
### end of timing code ###

@print_timing
def kevara(n):
    marked = [False, False] + [True] * (n - 1)
    for p in range(2, n + 1):
        for i in range(p, int(n / p) + 1):
            marked[p*i] = False
    return marked

@print_timing
def kevara2(n):
    marked = array('B', [1]) * (n+1)
    marked[0], marked[1] = 0, 0
    for p in range(2, n + 1):
        for i in range(p, int(n / p) + 1):
            marked[p*i] = 0
    return marked

if __name__ == '__main__':
    N = 1000000
    result = kevara(N)
    result2 = kevara2(N)
    assert result == [bool(item) for item in result2]
    print('{}\n{}'.format(getsizeof(result), getsizeof(result2)))

"""my output -->

kevara took 1545919.702 microseconds
kevara2 took 1694313.921 microseconds
8000072
1000065
"""
Gribouillis 1,391 Programming Explorer Team Colleague

I have found 1 or 2 bugs in the last version. Here is a new version.

The code currently cannot be applied to multiline statements, because it does not detect the beginning of the logical line of code, which means that in your example above, it will see the opcodes for line 2 but not for line 1. I need to add a mechanism to obtain the logical line.

I think I will upload a version to github soon, so that you can easily follow the changes in this code. The problem is that I don't have a lot of time to work on this now, so be patient...

EDIT: About obtaining the function name, you can get it from something similar to the printat() snippet https://www.daniweb.com/software-development/python/code/479747/print-with-line-and-file-information . Look at chriswelborn's comment.

Tcll commented: you're welcome ;) +4
Gribouillis 1,391 Programming Explorer Team Colleague

The first grain is missing in grains_sum. The result is 2**65-1, an odd number.

Gribouillis 1,391 Programming Explorer Team Colleague

You can start with computing the term frequencies for each term and every document. There is a snippet by Vegaseat Click Here

Once you have these frequencies, wikipedia has various formulas.
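
For example, here is a minimal sketch of one common tf-idf variant (wikipedia lists several others), where documents are lists of words:

import math

def tf_idf(term, doc, docs):
    tf = doc.count(term) / len(doc)            # frequency of the term in this document
    df = sum(1 for d in docs if term in d)     # number of documents containing the term
    return tf * math.log(len(docs) / df) if df else 0.0

docs = [['to', 'be', 'or', 'not', 'to', 'be'],
        ['to', 'sleep'],
        ['perchance', 'to', 'dream']]
print(tf_idf('be', docs[0], docs))    # 'be' only appears in the first document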

Gribouillis 1,391 Programming Explorer Team Colleague

It is a rather unusual first post. How could anybody have abused you as you joined 30 minutes ago ?

Gribouillis 1,391 Programming Explorer Team Colleague

It seems very easy because the file is a TAB-separated file. Here is my code in python 2

#!/usr/bin/env python
# -*-coding: utf8-*-
'''doc
'''
from __future__ import (absolute_import, division,
                        print_function, unicode_literals)
import codecs

def process_file(filename):
    with codecs.open(filename, encoding='utf8') as ifh:
        for line in ifh:
            row = line.split('\t')
            english, hindi = row[-2:]
            print('English:', english)
            print('Hindi:', hindi)

if __name__ == '__main__':
    process_file('hindmonocorp05.txt')

And the result