How do I extract whole numbers from a text file, put them into a list, sort it, and get its length and median in Python?


I have a text file on my desktop (paragraphs of words, numbers and punctuation). I need to extract just the numbers, sort them, and print the sorted list, its length and the median. I can open and read the file and extract the numbers, but every digit (even those of my two-digit numbers) prints on its own line. I am allowed to use import statistics.

I have made several attempts. I know I need .split(), .strip() and .append(); I just cannot figure out where.

from statistics import median
def sorting_file(): 
    with open('C:\\Users\\wildk\\Desktop\\Numbers_in_text.txt', 'r') as f:
        for line in f:
            words = line.split().strip('n')
            for i in words:
                for number in i:
                     number = number.rstrip('n')
                    if(number.isnumeric()):
                         print(number)
def get_median():    
     for median in number:
        number = sorted
        median = sorted (number) [len(number)//2]
        print(median, len )                 

if __name__=='__main__':
     sorting_file()
     get_median()

1 answer

Answer by trincot

The get_median function has several problems:

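  • It references number, but that name has no value there: the number variable in sorting_file is local to that function, and nothing is returned or passed in.
  • Inside the loop you rebind number to sorted, which is the built-in function itself. Nothing gets sorted here.
  • sorted(number) is then the same as sorted(sorted), which raises a TypeError because a function is not iterable.
  • Even with a real list, sorted(number)[len(number)//2] only gives the median when the list has an odd length. For an even length it picks the upper of the two middle values instead of their average.
  • print(median, len) passes the built-in function len itself rather than a length, so it would not show how many numbers there are.
  • Moreover, you have imported a median function from statistics, so you don't need get_median at all. Naming the loop variable median also shadows that import.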
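To illustrate the difference between picking the middle element and calling statistics.median, here is a minimal sketch. The helper name middle_element and the sample lists are made up for this example and do not come from the question.

```python
from statistics import median

def middle_element(values):
    """Return the element at index len // 2 of the sorted values."""
    ordered = sorted(values)  # sorted() returns a new list
    return ordered[len(ordered) // 2]

odd = [7, 1, 4]        # hypothetical sample with an odd length
even = [7, 1, 4, 10]   # hypothetical sample with an even length

# Both approaches agree when the length is odd.
print(middle_element(odd), median(odd))

# For an even length, median() averages the two middle values,
# while the index lookup returns only the upper one.
print(middle_element(even), median(even))
```

With an even number of values the two results differ, which is one more reason to rely on statistics.median here.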

Then sorting_file has problems too:

  • It calls .strip('n') on a list. Lists don't have such a method, and stripping the letter n serves no purpose. If the newline '\n' was meant, line.split() without arguments already discards it.
  • for number in i iterates over the individual characters of a word (i), which is why every digit prints on its own line. The word itself is what you are interested in.
  • .rstrip('n') serves no purpose either.
  • The function only intends to print numbers, but doesn't actually convert them to numbers, nor collect them in some list to use for getting a median
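The difference between iterating over characters and iterating over words can be seen in a small sketch. The sample line is invented for illustration, and str.isdecimal is used here as one reasonable digit check, not necessarily the one the asker needs.

```python
line = "There were 12 apples and 7 pears\n"  # hypothetical sample line

# Iterating over a string yields single characters,
# so "12" is seen as "1" and then "2".
digits = [ch for ch in line if ch.isdecimal()]

# line.split() yields whole whitespace-separated words and
# drops the trailing newline, so "12" stays in one piece.
numbers = [int(word) for word in line.split() if word.isdecimal()]

print(digits)
print(numbers)
```

Collecting the converted values in a list, rather than printing them, is what makes sorting and taking the median possible afterwards.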

Here is a correction that takes the above points into account:

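from statistics import median

def sorting_file():
    # Collect the numbers from every line, not just the first one.
    numbers = []
    with open('text.txt', 'r') as f:
        for line in f:
            # Keep only the words made up entirely of decimal digits.
            numbers.extend(int(i) for i in line.split() if i.isdecimal())
    return sorted(numbers)

numbers = sorting_file()
print(numbers)          # sorted list
print(len(numbers))     # how many numbers were found
print(median(numbers))  # median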
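Since the file is said to contain punctuation, a number written as "12," or "(40" is not a purely numeric word and would be skipped by a whole-word check. One possible workaround, sketched below under that assumption, is to pull out runs of digits with re.findall. The function name extract_numbers and the sample text are hypothetical. Note that this variant also picks up digits embedded in words and would split "3.14" into 3 and 14, so it only suits files where that is acceptable.

```python
import re
from statistics import median

def extract_numbers(text):
    """Return all runs of ASCII digits in text as a sorted list of ints."""
    return sorted(int(match) for match in re.findall(r"[0-9]+", text))

# Hypothetical sample text with punctuation attached to the numbers.
sample = "I bought 12 eggs, 3 loaves (and 40 napkins). Total: 55."

numbers = extract_numbers(sample)
print(numbers)          # sorted list
print(len(numbers))     # how many numbers were found
print(median(numbers))  # median
```

To run it on a file, the whole content can be read with f.read() and passed to the function in place of sample.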