python统计文本文件内单词数量的方法

yipeiwu_com5年前Python基础

本文实例讲述了python统计文本文件内单词数量的方法。分享给大家供大家参考。具体实现方法如下:

# count lines, sentences, and words of a text file
# set all the counters to zero
lines, blanklines, sentences, words = 0, 0, 0, 0
print '-' * 50
try:
 # use a text file you have, or google for this one ...
 filename = 'GettysburgAddress.txt'
 textf = open(filename, 'r')
except IOError:
 print 'Cannot open file %s for reading' % filename
 import sys
 sys.exit(0)
# reads one line at a time
for line in textf:
 print line,  # test
 lines += 1
 if line.startswith('\n'):
  blanklines += 1
 else:
  # assume that each sentence ends with . or ! or ?
  # so simply count these characters
  sentences += line.count('.') + line.count('!') + line.count('?')
  # create a list of words
  # use None to split at any whitespace regardless of length
  # so for instance double space counts as one space
  tempwords = line.split(None)
  print tempwords # test
  # word total count
  words += len(tempwords)
textf.close()
print '-' * 50
print "Lines   : ", lines
print "Blank lines: ", blanklines
print "Sentences : ", sentences
print "Words   : ", words
# optional console wait for keypress
from msvcrt import getch
getch()

希望本文所述对大家的python程序设计有所帮助。

相关文章

python使用celery实现异步任务执行的例子

使用celery在django项目中实现异步发送短信 在项目的目录下创建celery_tasks用于保存celery异步任务。 在celery_tasks目录下创建config.py文件...

Python正则表达式匹配日期与时间的方法

下面给大家介绍下Python正则表达式匹配日期与时间 #!/usr/bin/env python # -*- coding: utf-8 -*- __author__ = 'Ran...

python中异常捕获方法详解

在Python中处理异常使用的是try-except代码块,try-except代码块放入让python执行的操作,同时告诉python程序如果发生了异常该怎么办,try-except这...

python2.7和NLTK安装详细教程

本文为大家分享了python2.7和NLTK安装教程,具体内容如下 系统:Windows 7 Ultimate 64-bits Python 2.7安装 下载Python 2.7:官网下...

TF-IDF算法解析与Python实现方法详解

TF-IDF算法解析与Python实现方法详解

TF-IDF(term frequency–inverse document frequency)是一种用于信息检索(information retrieval)与文本挖掘(text m...