宜配屋

前言

Python包含6种内置的序列：列表、元组、字符串、Unicode字符串、buffer对象、xrange对象。在序列中的每个元素都有自己的编号。列表与元组的区别在于，列表是可以修改，而组元不可修改。理论上几乎所有情况下元组都可以用列表来代替。有个例外是但元组作为字典的键时，在这种情况下，因为键不可修改，所以就不能使用列表。

我们在一些统计工作或者分析过程中，有事会遇到要统计一个序列中出现最多次的元素，比如一段英文中，查询出现最多的词是什么，及每个词出现的次数。一遍的做法为，将每个此作为key，出现一次,value增加1。

例如：

morewords = ['why','are','you','not','looking','in','my','eyes']
for word in morewords:
 word_counts[word] += 1

collections.Counter 类就是专门为这类问题而设计的，它甚至有一个有用的 most_common() 方法直接给了你答案。

collections模块

collections模块自Python 2.4版本开始被引入，包含了dict、set、list、tuple以外的一些特殊的容器类型，分别是：

OrderedDict类：排序字典，是字典的子类。引入自2.7。
namedtuple()函数：命名元组，是一个工厂函数。引入自2.6。
Counter类：为hashable对象计数，是字典的子类。引入自2.7。
deque：双向队列。引入自2.4。
defaultdict：使用工厂函数创建字典，使不用考虑缺失的字典键。引入自2.5。

文档参见：http://docs.python.org/2/library/collections.html。

Counter类

Counter类的目的是用来跟踪值出现的次数。它是一个无序的容器类型，以字典的键值对形式存储，其中元素作为key，其计数作为value。计数值可以是任意的Interger（包括0和负数）。Counter类和其他语言的bags或multisets很相似。

为了演示，先假设你有一个单词列表并且想找出哪个单词出现频率最高。你可以这样做：

words = [
 'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
 'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
 'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
 'my', 'eyes', "you're", 'under'
]
from collections import Counter
word_counts = Counter(words)
# 出现频率最高的3个单词
top_three = word_counts.most_common(3)
print(top_three)
# Outputs [('eyes', 8), ('the', 5), ('look', 4)]

另外collections.Counter还有一个比较高级的功能，支持数学算术符的相加相减。

>>> a = Counter(words)
>>> b = Counter(morewords)
>>> a
Counter({'eyes': 8, 'the': 5, 'look': 4, 'into': 3, 'my': 3, 'around': 2,
"you're": 1, "don't": 1, 'under': 1, 'not': 1})
>>> b
Counter({'eyes': 1, 'looking': 1, 'are': 1, 'in': 1, 'not': 1, 'you': 1,
'my': 1, 'why': 1})
>>> # Combine counts
>>> c = a + b
>>> c
Counter({'eyes': 9, 'the': 5, 'look': 4, 'my': 4, 'into': 3, 'not': 2,
'around': 2, "you're": 1, "don't": 1, 'in': 1, 'why': 1,
'looking': 1, 'are': 1, 'under': 1, 'you': 1})
>>> # Subtract counts
>>> d = a - b
>>> d
Counter({'eyes': 7, 'the': 5, 'look': 4, 'into': 3, 'my': 2, 'around': 2,
"you're": 1, "don't": 1, 'under': 1})
>>>

参考文档：

https://docs.python.org/3/library/collections.html

总结

以上就是这篇文章的全部内容了，希望本文的内容对大家的学习或者工作具有一定的参考学习价值，如果有疑问大家可以留言交流，谢谢大家对【听图阁-专注于Python设计】的支持。

利用Python找出序列中出现最多的元素示例代码

相关文章

简单介绍Ruby中的CGI编程

python编写朴素贝叶斯用于文本分类

分享几道你可能遇到的python面试题

python实现websocket的客户端压力测试

django的聚合函数和aggregate、annotate方法使用详解

© YiPeiWu.com 【宜配屋】粤ICP备17031333号

Powered By Z-BlogPHP. Theme by TOYEAN.

宜配屋