Feng erdong's Blog

Life is beautiful

Python Built-in Functions


  • cmp(x, y)

Attribute

  • delattr(object, name)
    delattr(x, 'foobar') is equivalent to del x.foobar.
  • getattr(object, name[, default])
  • setattr(object, name, value)
  • hasattr(object, name)
  • repr(object)
    This is the same value yielded by conversions (reverse quotes).
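The attribute functions above can be tried together in a short sketch. The Point class and its attribute names are hypothetical, chosen only for illustration; the code is written for Python 3 but these calls behave the same way on Python 2.

```python
class Point(object):
    """Hypothetical class used only for this illustration."""
    def __init__(self, x):
        self.x = x

p = Point(1)

setattr(p, 'y', 2)                    # same as p.y = 2
has_y = hasattr(p, 'y')               # True
y = getattr(p, 'y')                   # same as p.y
z = getattr(p, 'z', 0)                # no attribute z, so the default 0 is returned
delattr(p, 'y')                       # same as del p.y
has_y_after_delete = hasattr(p, 'y')  # False
text = repr(p.x)                      # '1'
```

Passing a default to getattr avoids the AttributeError that a plain p.z would raise.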

Type

  • basestring
    isinstance(obj, basestring) is equivalent to isinstance(obj, (str, unicode)).
  • isinstance(object, classinfo)
  • issubclass(class, classinfo)
  • super(type[, object-or-type])
class C(B):
    def method(self, arg):
        super(C, self).method(arg)
  • type(object) type(name, bases, dict)
    With one argument, return the type of an object.
    With three arguments, return a new type object.
# the following two statements create identical type objects
class X(object):
    a = 1
X = type('X', (object,), dict(a=1))
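As a rough illustration of how the type functions above fit together, here is a small sketch. The classes B and C mirror the super() example, with return values added (an assumption) so the result can be inspected.

```python
class B(object):
    def method(self, arg):
        return 'B got %s' % arg

class C(B):
    def method(self, arg):
        # delegate to the parent implementation, then extend its result
        return super(C, self).method(arg) + ' via C'

c = C()
result = c.method(1)                       # 'B got 1 via C'
is_b = isinstance(c, B)                    # True: C is a subclass of B
is_int_or_str = isinstance(c, (int, str))  # False: classinfo may be a tuple
sub = issubclass(C, B)                     # True
same_type = type(c) is C                   # True

# three-argument type() builds a class dynamically
X = type('X', (object,), dict(a=1))
x_a = X.a                                  # 1
```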

Type Cast

  • unichr(i)
  • chr(i)
    Return a string of one character whose ASCII code is the integer i.
chr(97)
>>'a'
  • ord(c)
  • int(x)
  • hex(x)
  • float(x)
  • long(x)
  • oct(x)
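A few of the conversions above, sketched in Python 3. unichr and long exist only in Python 2, so they are left out, and the two-argument form of int is shown as an extra that the list above does not mention.

```python
letter = chr(97)          # 'a'
code = ord('a')           # 97
number = int('42')        # 42
from_hex = int('ff', 16)  # 255, the optional second argument is the base
as_hex = hex(255)         # '0xff'
as_float = float('1.5')   # 1.5
```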

Collection

  • tuple([iterable])
    Return a tuple whose items are the same and in the same order as iterable's items.
    tuple is an immutable sequence type.
list1 = ['edfeng', 'tyzhang', 'zming']
tuple(list1)
>>('edfeng', 'tyzhang', 'zming')
  • range(stop) range(start, stop[, step])
    If the start argument is omitted, it defaults to 0.
  • enumerate(sequence, start=0)
    Return an enumerate object.
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
list(enumerate(seasons))
>>[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
list(enumerate(seasons, start=1))
>>[(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
  • reversed(seq)
    Return a reverse iterator.
  • sorted(iterable[, cmp[, key[, reverse]]])
    In general, the key and reverse conversion processes are much faster than specifying an equivalent cmp function.
  • next(iterator[, default])
    Retrieve the next item from the iterator by calling its next() method.
  • iter(o[, sentinel])
    Return an iterator object. If the second argument, sentinel, is given, then o must be a callable object. The iterator created in this case will call o with no arguments for each call to its next() method; if the value returned is equal to sentinel, StopIteration will be raised, otherwise the value will be returned.
with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)
  • filter(function, iterable)
  • map(function, iterable, ...)
    Apply function to every item of iterable and return a list of the results.
list1 = ['edfeng', 'tyzhang', 'zming']
list2 = [100, 50, 80]
map(lambda name, score : name + '-' + str(score), list1, list2)
>>['edfeng-100', 'tyzhang-50', 'zming-80']

This is equivalent to the following list comprehension:

list1 = ['edfeng', 'tyzhang', 'zming']
list2 = [100, 50, 80]
[name + '-' + str(score) for name, score in zip(list1, list2)]
>>['edfeng-100', 'tyzhang-50', 'zming-80']
  • reduce(function, iterable[, initializer])
    reduce works like a reduce function in CouchDB: it takes a list and returns a single value.
    reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).
  • min()/max() min(iterable[, key=func])
    min(arg1, arg2, *args[, key=func])
result = [('tester150411@gmail.com', '62', '624'), ('tester150411@gmail.com', '62', '528')]

print "Max time:",
print max(result, key=lambda x: int(x[2]))

print "Min time:",
print min(result, key=lambda x: int(x[2]))
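The reduce, sorted and min/max entries above can be sketched together as follows. This is written for Python 3, where reduce lives in functools (an assumption about the reader's interpreter; the notes themselves target Python 2), and the (name, score) pairs are made up for illustration.

```python
from functools import reduce  # a built-in on Python 2, in functools on Python 3

total = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])          # ((((1+2)+3)+4)+5) = 15
total_from_100 = reduce(lambda x, y: x + y, [1, 2, 3], 100)  # initializer goes first: 106

# hypothetical (name, score) pairs
scores = [('edfeng', 100), ('tyzhang', 50), ('zming', 80)]

# key extracts the value to compare; reverse=True sorts in descending order
by_score = sorted(scores, key=lambda pair: pair[1], reverse=True)
best = max(scores, key=lambda pair: pair[1])
worst = min(scores, key=lambda pair: pair[1])
```

Using key keeps the comparison logic in one small function, which is the approach the sorted entry recommends over cmp.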

Decorators

  • classmethod(function)
    A class method receives the class as implicit first argument, just like an instance method receives the instance. To declare a class method, use this idiom:
class C:
    @classmethod
    def f(cls, arg1, arg2, ...): ...
  • staticmethod(function)
    A static method does not receive an implicit first argument. To declare a static method, use this idiom:
class C:
    @staticmethod
    def f(arg1, arg2, ...): ...
  • property
class C(object):
    def __init__(self):
        self._x = None

    # c.x will invoke this method
    @property
    def x(self):
        """I'm the 'x' property."""
        return self._x

    # c.x = value will invoke this method
    @x.setter
    def x(self, value):
        self._x = value

    # del c.x will invoke this method
    @x.deleter
    def x(self):
        del self._x
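To see the three decorators working side by side, here is a minimal sketch. The Circle class, its method names and the validation rule are hypothetical and serve only as an illustration.

```python
class Circle(object):
    """Hypothetical class, for illustration only."""

    def __init__(self, radius):
        self._radius = radius

    @classmethod
    def unit(cls):
        # cls is the class itself, so subclasses get instances of their own type
        return cls(1)

    @staticmethod
    def is_valid_radius(value):
        # no implicit first argument is passed
        return value >= 0

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if not Circle.is_valid_radius(value):
            raise ValueError('radius must be non-negative')
        self._radius = value

c = Circle.unit()   # called on the class, not on an instance
c.radius = 3        # invokes the setter
current = c.radius  # invokes the getter
```

Because radius is a property, callers keep plain attribute syntax while the setter still gets a chance to validate the value.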

Math

  • pow(x, y[, z])
  • sum(iterable[, start])
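The optional arguments of pow and sum are easy to overlook; a quick sketch with arbitrary numbers:

```python
squared = pow(2, 10)               # 1024, same as 2 ** 10
modular = pow(2, 10, 1000)         # 24, same result as (2 ** 10) % 1000
total = sum([1, 2, 3])             # 6
offset_total = sum([1, 2, 3], 10)  # 16, start is added to the sum of the items
```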

IO

  • print(*objects, sep=' ', end='\n', file=sys.stdout)
from __future__ import print_function
print('a', 'b', 'c', sep='**')
>>a**b**c
Note: This function is not normally available as a built-in, since the name print is recognized as the print statement. To disable the statement and use the print() function, put the future statement from __future__ import print_function at the top of your module.
  • raw_input([prompt])
    Read a line from the console.
s = raw_input('--> ')
>>--> Monty Python's Flying Circus
s
>> "Monty Python's Flying Circus"

Python Miscellany

  • Print without a trailing newline
    You can use a trailing comma to avoid a newline being printed.
print "The total count is",
print total_count
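With the print() function, the same effect comes from the end parameter instead of a trailing comma. The sketch below assumes Python 3; total_count is a made-up value, and the output goes to an in-memory buffer through the file parameter purely so the resulting line can be inspected.

```python
import io

total_count = 3  # hypothetical value

buf = io.StringIO()  # stands in for sys.stdout so the result can be inspected
print("The total count is", end=" ", file=buf)  # end=" " replaces the newline
print(total_count, file=buf)
line = buf.getvalue()  # 'The total count is 3\n'
```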

Notes on CouchDB: The Definitive Guide - Chapter 21: View Cookbook for SQL Jockeys

Defining a View

Defining a view is done by creating a special document in a CouchDB database. The only real specialness is the _id of the document, which starts with _design/ (for example, _design/application). Other than that, it is just a regular CouchDB document.

Querying a View

/database/_design/application/_view/viewname

Map Function

A map function may call the built-in emit(key, value) function 0 to N times per document, creating a row in the map result per invocation.

(A map function may call emit(key, value) any number of times.)

Look up by key

To look something up quickly, regardless of the storage mechanism, an index is needed. An index is a data structure optimized for quick search and retrieval. CouchDB’s map result is stored in such an index, which happens to be a B+ tree.

(CouchDB writes the result of the map/reduce functions to a file named xxxx.view. The required data is copied from the documents into the view file rather than merely referenced, so a view file can become very large.)

Query: /ladies/_design/ladies/_view/age?key=5

Result:

{
    "total_rows":3,
    "offset":1,
    "rows":[
        {"id":"fc2636bf50556346f1ce46b4bc01fe30","key":5,"value":"Lena"}
    ]
}

total_rows: the total number of rows in this view
offset: the position (index) within the view of the first row returned

Aggregation Function

  • Reduce functions are similar to aggregate functions in SQL. They compute a value over multiple documents.
  • Reduce functions operate on the output of the map function (also called the map result or intermediate result).
  • A reduce function takes a list of keys and a list of values as its first two arguments.
  • There is one difference between the map and the reduce function: the map function uses emit() to create its result, whereas the reduce function returns a value.

The result of a reduce view may look like this:

{
    "rows":[
        {"key":null,"value":15}
    ]
}
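The lookup and reduce results above can be imitated in plain Python to make the mechanics concrete. This is only a simulation, not CouchDB's API: the three documents, their ids and the helper names build_view, query and reduce_sum are all assumptions, with ages chosen so that the output matches the numbers shown (offset 1 for key 5, and a sum of 15).

```python
# Hypothetical documents; only Lena (age 5) appears in the notes above.
docs = [
    {"_id": "a", "name": "Anna", "age": 4},
    {"_id": "b", "name": "Lena", "age": 5},
    {"_id": "c", "name": "Mia", "age": 6},
]

def map_age_to_name(doc):
    """Plays the role of a map function: emit 0..N (key, value) pairs per doc."""
    if "age" in doc and "name" in doc:
        yield doc["age"], doc["name"]

def build_view(docs, map_fn):
    """Run map_fn over every doc and keep the rows sorted by key, like a view index."""
    rows = [{"id": d["_id"], "key": k, "value": v} for d in docs for k, v in map_fn(d)]
    return sorted(rows, key=lambda row: row["key"])

def query(view, key):
    """Imitate ?key=...: offset is the number of rows before the first match."""
    matches = [row for row in view if row["key"] == key]
    offset = next((i for i, row in enumerate(view) if row["key"] >= key), len(view))
    return {"total_rows": len(view), "offset": offset, "rows": matches}

def reduce_sum(keys, values):
    """Plays the role of a reduce function: return a single value."""
    return sum(values)

view = build_view(docs, map_age_to_name)
lookup = query(view, 5)

# A second view that emits (name, age), reduced to one row with a null key.
age_view = build_view(docs, lambda d: [(d["name"], d["age"])])
reduced = {"rows": [{"key": None, "value": reduce_sum(
    [r["key"] for r in age_view], [r["value"] for r in age_view])}]}
```

Note that the map step only emits rows, while the reduce step returns a value, which is the difference the notes point out.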

Get Unique Values

The map result will look like this:

{"total_rows":9,"offset":0,"rows":[
    {"id":"3525ab874bc4965fa3cda7c549e92d30","key":"bike","value":null},
    {"id":"3525ab874bc4965fa3cda7c549e92d30","key":"couchdb","value":null},
    {"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"couchdb","value":null},
    {"id":"da5ea89448a4506925823f4d985aabbd","key":"couchdb","value":null},
    {"id":"3525ab874bc4965fa3cda7c549e92d30","key":"drums","value":null},
    {"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"hypertext","value":null},
    {"id":"da5ea89448a4506925823f4d985aabbd","key":"music","value":null},
    {"id":"da5ea89448a4506925823f4d985aabbd","key":"mustache","value":null},
    {"id":"53f82b1f0ff49a08ac79a9dff41d7860","key":"philosophy","value":null}
]
}
  • CouchDB has no keyword like DISTINCT, but the group query parameter can group the result of a reduce function. As mentioned in the last section, a reduce function returns only one value; with the group=true query parameter, multiple values are returned, one per group (each is the reduce result of its group), grouped by the keys in the keys list. In this way we get the distinct keys (in this case, the distinct tags).
  • By default the reduce result discards keys, so key is always null in it. With group=true in the query, the key is used for grouping, so it is preserved.
  • The group=true parameter can only be used on a view that has a reduce function.
  • If a view has a reduce function, a query returns the reduce result by default. Pass reduce=false explicitly to get the map result instead.
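The effect of group=true can be imitated in Python with itertools.groupby. Again this is a simulation under stated assumptions rather than CouchDB code: the keys are copied from the map result above, while reduce_count (a reduce that counts rows) and run_reduce are hypothetical names, since the notes do not show the actual reduce function.

```python
from itertools import groupby

# Keys of the nine map rows shown above, in view (sorted) order; every value is null.
keys = ["bike", "couchdb", "couchdb", "couchdb", "drums",
        "hypertext", "music", "mustache", "philosophy"]
rows = [(key, None) for key in keys]

def reduce_count(keys, values):
    """Assumed reduce function: count the rows it is given."""
    return len(values)

def run_reduce(rows, group=False):
    if not group:
        # default behaviour: one row for the whole view, key discarded
        return [{"key": None,
                 "value": reduce_count([k for k, _ in rows], [v for _, v in rows])}]
    result = []
    # groupby relies on the rows already being sorted by key, as view rows are
    for key, members in groupby(rows, key=lambda row: row[0]):
        members = list(members)
        result.append({"key": key,
                       "value": reduce_count([k for k, _ in members],
                                             [v for _, v in members])})
    return result

ungrouped = run_reduce(rows)
grouped = run_reduce(rows, group=True)
distinct_tags = [row["key"] for row in grouped]
```

The grouped result has one row per distinct key, which is how the distinct tags fall out of the reduce step.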

JavaScript Collection Operations

Notes on some less common JavaScript collection operations.

Array Operations

var arr = ['a', 'b', 'c'];

The usual way to iterate over an array is:

for (var index = 0; index < arr.length; index++) {
    console.log(arr[index]);
}
>>a
>>b
>>c

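In JavaScript, objects and arrays are both built on a hashtable-like data structure, that is, both are based on key-value pairs, so in this respect an array can be treated just like an object. To walk through the contents of a JavaScript object we use the for (var prop in object) syntax, and the same syntax works for arrays (note that for...in yields the indices as strings and also visits any other enumerable properties of the array):
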
for (var idx in arr) {
    console.log(arr[idx]);
}
>>a
>>b
>>c

Useful Git Commands

  • git remote add <remote-name> <remote-repo-url>
    Add a new remote to the current repository and name it remote-name.

  • git remote rename <original-name> <new-name>
    Rename remote original-name to new-name

  • git config branch.<branch-name>.remote <remote-name>
    Set the default remote of branch branch-name to remote-name. This command writes the remote entry of a section like the following in .git/config:

[branch "<branch-name>"]
    remote = <remote-name>
    merge = refs/heads/master

The .git/config file stores the branch and remote settings of a git repository.

  • git push <remote-name> [branch-name]
    Push branch branch-name (the current branch if none is given) to remote remote-name.

  • git branch -D <branch-name>
    Force-delete branch branch-name, even if it has not been merged.
  • git branch -m <original-name> <new-name>
    Rename branch original-name to new-name.

  • git checkout -b test origin/test
    Create a local branch test that tracks the remote branch origin/test, and switch to it.


  • git reflog
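    Reflog is a mechanism to record when the tips of branches are updated. This command manages the information recorded in it.

    Use case:

    1. Suppose the current branch is at the latest code, with HEAD at commit 4a56876.
    2. To look at how a feature behaved a few days ago, the user moves to commit 041f088 by running: git reset --hard 041f088
    3. git log now ends at commit 041f088, and the newer commits are no longer listed.
    4. To get back to the latest code, the commit hash of the previous HEAD is needed, but it is no longer known. git reflog lists the operations recorded for the current branch, so the hash of the latest code can be found there. Its output shows that the reset was run from commit 4a56876, so the previous HEAD was 4a56876; running git reset --hard 4a56876 switches back to that code.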
  • git clean -f / git clean -f -d
    Remove untracked files from the working directory. With -d, untracked directories are removed in addition to untracked files.
  • git add -p
    Interactively choose hunks of patch between the index and the work tree and add them to the index. This gives the user a chance to review the difference before adding modified contents to the index.

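Mapping Between XPath and jQuery Selectors

  • To search the whole document starting from the root, an XPath expression needs to start with //
  • With attribute selectors, jQuery allows unquoted attribute values, but XPath attribute values must be quoted

Basic Selectors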

  • id
    • $('#import-datasenders')
    • $x('//a[@id="import-datasenders"]')
  • class
    • $('a.register_data_sender')
    • $x('//a[@class="register_data_sender"]')
  • tag
    • $('a')
    • $x('//a')
  • *
    • $('#import-datasenders')
    • $x('//*[@id="import-datasenders"]')
  • Selector groups
    • $('#import-datasenders, .register_data_sender')
    • $x('//*[@id="import-datasenders"]|//*[@class="register_data_sender"]')

Hierarchy Selectors

  • Descendant elements
    • $('#my_subjects li')
    • $x('//*[@id="my_subjects"]//li')
  • Direct children
    • $('.secondary_tab>li')
    • $x('//*[@class="secondary_tab"]/li')
  • Next sibling element

      1. $('input+img')
      2. $('input').next()
    • $x('//input/following-sibling::img')

  • Previous sibling node

    • $('input').prev()
    • $x('//input/preceding-sibling::*')

    These XPath expressions look very similar to the jQuery ones above, but they do not behave identically. jQuery takes only the sibling immediately after/before the current element, and only if it matches; XPath searches all following/preceding siblings of the current element for matches, with no guarantee that a match is the immediately next/previous node.

  • All following siblings

      1. $('input~img')
      2. $('input').nextAll()
    • $x('//input/following-sibling::img')
  • All preceding siblings

    • $('input').prevAll()
    • $x('//input/preceding-sibling::*')
  • All siblings
    • $('input').siblings()
    • $x('//input/following-sibling::*|//input/preceding-sibling::*')

Filter Selectors

Basic filter selectors

  • :first

    • $('ul>li:first')
    • $x('//ul/li')[0]
  • :last

    • $('a:last')
    • $x('//a').pop()

    The jQuery pseudo-classes :first/:last return the first/last element of the result set. This no longer has anything to do with CSS selectors; it is purely an array operation.

  • :even

  • :odd

  • :eq(index)

  • :lt(index)

  • :gt(index)

    The selectors above are all specific to jQuery; they exist purely to simplify array operations on the result set.

Content filter selectors

  • :contains(text)

    • $('ul>li:contains("Download")')
    • $x('//ul/li[contains(text(), "Download")]')
  • :empty

    • $('a:empty')
    • ?
  • :parent

    • $('td:parent')
    • $x('//text()/parent::td | //*[not(text())]/parent::td')
  • :has(selector)

    • $('ul:has(a)')
    • $x('//a/ancestor::ul')

Visibility filters

  • :hidden
    • $('input:hidden')
    • ?
  • :visible
    • $('input:visible')
    • ?

Attribute filters

  • [attribute]

    • $('div[style]')
    • $x('//div[@style]')
  • [attribute=value]

    • $('a[target=_blank]')
    • $x('//a[@target="_blank"]')
  • [attribute!=value]

    • $(':text[name!=email]')
      • $x('//input[@type="text"][@name!="email"]')
      • $x('//input[@type="text" and @name!="email"]')
  • [attribute^=value]

    • $('input[name^=tel]')
    • $x('//input[starts-with(@name, "tel")]')
  • [attribute$=value]

    • $('input[name$=number]')
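    • $x('//input[ends-with(@name, "number")]')

    The ends-with function did not work in either the Chrome console or the Firebug console. This is most likely an XPath version issue: ends-with is an XPath 2.0 function, while browsers implement XPath 1.0.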

  • [attribute*=value]

    • $('input[name*=tel]')
    • $x('//input[contains(@name, "tel")]')
  • [att1][att2][att3]

    • $('input[type=text][name*=tel]')
    • $x('//input[@type="text"][contains(@name, "tel")]')
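Outside the browser, the simplest attribute filters can also be tried from Python's standard library. The sketch below assumes xml.etree.ElementTree, which supports only a small subset of XPath (no contains() or starts-with()) and requires relative paths beginning with .// instead of //; the markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, for illustration only
root = ET.fromstring(
    '<body>'
    '<div style="color:red"><a target="_blank" href="#1">one</a></div>'
    '<div><a href="#2">two</a></div>'
    '</body>'
)

styled_divs = root.findall(".//div[@style]")          # like $('div[style]')
blank_links = root.findall(".//a[@target='_blank']")  # like $('a[target=_blank]')
link_texts = [a.text for a in blank_links]
```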

Child filters

  • :nth-child
    • $('ul>li:nth-child(1)')
    • $x('//ul/li[position()=1]')

XPath's position() function behaves like jQuery's :nth-child pseudo-selector here, and both are 1-based.

  • :first-child

    • $('ul>li:first-child')
    • $x('//ul/li[position()=1]')
  • :last-child

    • $('ul>li:last-child')
    • $x('//ul/li[last()]')

    XPath does not appear to have a first() function.
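The 1-based positions can be checked quickly with Python's xml.etree.ElementTree, assuming its limited XPath subset is an acceptable stand-in for the browser's $x; the list markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, for illustration only
root = ET.fromstring('<body><ul><li>a</li><li>b</li><li>c</li></ul></body>')

first = root.findall(".//ul/li[1]")                 # positions start at 1
last = root.findall(".//ul/li[last()]")             # there is last() but no first()
second_to_last = root.findall(".//ul/li[last()-1]")

first_text = first[0].text
last_text = last[0].text
second_to_last_text = second_to_last[0].text
```

Since positions start at 1, the first item is simply li[1], which is why no first() function is needed.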

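Useful Shell Script Snippets

Tips
  • No spaces are allowed around the = in an assignment
  • Use double quotes rather than single quotes when variables need to be expanded
  • Sometimes ${var} must be written explicitly instead of $var
Interpreter options
  • -e Exit on error
  • -x Print each command, after expansion, before running it
#! /bin/bash -ex

Conditionals
  • [ -d ~/directory ] || mkdir ~/directory Create the directory if it does not exist
  • [ -f ~/file ] || touch ~/file Create the file if it does not exist
  • [[ str =~ substr ]] Test whether one string contains another (=~ performs a regular-expression match)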
function verify_md5(){
    if [[ $1 =~ $2 ]]
    then
        echo "MD5 verification SUCCESS for $1."
    else
        echo "MD5 verification FAILED for $1."
    fi
}

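To learn more about bash conditional expressions, run man bash and search for the CONDITIONAL EXPRESSIONS section.

File operations

  • sed -i .bak 's/old/new/g' file First create a backup of file named file.bak, then replace every occurrence of old in file with new (this is the BSD/macOS form; GNU sed expects -i.bak with no space)
  • echo "some content" >> ~/file Append content to the end of the file
  • mv original_file_name{,.new_suffix} Rename original_file_name to original_file_name.new_suffix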

Functions
  • Defining and calling functions: the style resembles JavaScript, but neither the definition nor the call needs parentheses
function main {
    echo "in main function"
}
main
  • Passing arguments: shell functions have no parameter list; inside the function, $n receives the argument at position n
function greeting {
    echo "Hello, $1, your id is $2"
}

function main {
    greeting "Feng Erdong" edfeng
    greeting "Zhang xiaoming" xmzhang
}

main

Dates and Times in Python

  • dateutil.parser.parse converts a date string into a timezone-aware datetime object
from dateutil.parser import parse

date = parse("08.10.2012" + " 00:00:00+0000", dayfirst=True)

date
>>datetime.datetime(2012, 10, 8, 0, 0, tzinfo=tzutc())

date = parse("08.10.2012" + " 00:00:00+0000", dayfirst=False)

date
>>datetime.datetime(2012, 8, 10, 0, 0, tzinfo=tzutc())

If you are using Python 2.x, you need python-dateutil 1.x; python-dateutil 2.x requires Python 3.x.

  • datetime.timedelta adds an offset to a datetime object
from datetime import timedelta

date
>>datetime.datetime(2012, 8, 10, 0, 0, tzinfo=tzutc())

date = date + timedelta(days=1)

date
>>datetime.datetime(2012, 8, 11, 0, 0, tzinfo=tzutc())

date = date + timedelta(hours=1)

date
>>datetime.datetime(2012, 8, 11, 1, 0, tzinfo=tzutc())
  • datetime.datetime.strptime converts a date string into a datetime object
from datetime import datetime

date = datetime.strptime('2012.08.10', '%Y.%m.%d')

date
>>datetime.datetime(2012, 8, 10, 0, 0)
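The three tools above can be combined using only the standard library. This sketch assumes Python 3 (datetime.timezone does not exist in Python 2) and uses the strptime format string to play the role of dateutil's dayfirst flag; it is an illustration, not a replacement for dateutil's more flexible parsing.

```python
from datetime import datetime, timedelta, timezone

text = "08.10.2012"

# The format string decides whether the day or the month comes first
day_first = datetime.strptime(text, "%d.%m.%Y")    # 8 October 2012
month_first = datetime.strptime(text, "%m.%d.%Y")  # 10 August 2012

# Attach UTC, then shift by one day and one hour as in the timedelta example
aware = month_first.replace(tzinfo=timezone.utc)
shifted = aware + timedelta(days=1, hours=1)

formatted = shifted.strftime("%Y.%m.%d %H:%M")  # back to a string
iso = shifted.isoformat()
```

strftime is the inverse of strptime: the same format codes turn the datetime object back into text.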