猴年马月: February 2011

Friday, February 25, 2011

关于多线程死锁

从CSDN上面看来的讨论，很有启发性：
－－－
A_lock()
B_lock()
do_something()
A_unlock()
B_unlock()

是否能产生死锁，
死锁的条件是什么？

#1 by zhangjingyuhua

这样看不出来。
但是考虑下面这种情况：

void f1()
{
    A_lock();
    do_something(); #P1
    A_unlock();
}

void f2()
{
    B_lock();
    f1(); #P2
    B_unlock();
}

void do_something()
{
    B_lock();
    B_unlock();
}

例如两个线程，线程A调用f1,线程B调用f2。如果线程A执行到P1位置，线程B执行到P2位置。那么就死锁了。

A进入do_something且A获得了A_lock，里面有B_lock()，但是线程B已经获得了B_lock，而线程B这时却在等待f1中的A_lock。
这样就死锁了。

#2 by Jinhao

别跑题，是在一个函数中。

func()
{
    lock(A)
    lock(b)
    do something()
    unlock(b)
    unlock(A)
}

加锁顺序是：A->b
解锁顺序是：b->A

把题看清再说话。

#3 by zhangjingyuhua

麻烦你看清楚。已经说了，看不出来。
我写那段代码是回答你“死锁的条件是什么”这个问题。这与是不是在一个函数中无关。明白不？

如果你懂了，那你还跑到这里来问什么？

func()
{
    lock(A)
    lock(b)
    do something()
    unlock(b)
    unlock(A)
}

你觉得这个没问题？光看这个有什么用？

如果存在另一个函数：

func2()
{
    lock(B)
    lock(A)
    do something()
    unlock(A)
    unlock(B)
}

那你认为你的func有没有问题？

#4 by Jinhao

我在用别人的时，别人不应该在等我。
满足这一条就不会死了。

A_lock()
B_lock()
do_something()
A_unlock()
B_unlock()

除非do_something有对A_lock操作。

#5 by lin_style

单纯地这么看，不会有死锁。
其次，如果do_something里面会调用A_lock，则要看A_lock的实现。
最后，要保证正确操作锁的顺序，后入的锁要先出。

A_lock()
B_lock()
do_something()
B_unlock()
A_unlock()

#6 by Jinhao
－－－
来源：http://topic.csdn.net/u/20090429/19/fffd3f89-6cea-4859-a57b-bd84c04ead96.html。我觉得最有用的是以下这么两句。

一、我在用别人的时，别人不应该在等我。满足这一条就不会死了。
二、要保证正确操作锁的顺序，后入的锁要先出。

mutex deadlock

Problems with Mutexes

An important problem associated with mutexes is the possibility of deadlock. A program can deadlock if two (or more) threads have stopped execution or are spinning permanently. The simplest deadlock situation: thread 1 locks lock A, thread 2 locks lock B, thread 1 wants lock B and thread 2 wants lock A. Instant deadlock. You can prevent this from happening by making sure threads acquire locks in an agreed order (lock ordering). Deadlock can also happen if threads do not unlock mutexes properly.

Race conditions occur when multiple threads share data and at least one of the threads accesses the data without going through a defined synchronization mechanism (Nichols 203). This could result in erroneous results even in an inconsistent manner which makes race conditions particularly difficult to debug. Library calls outside of your program's control are common culprits. Make sure you take steps within your program to enforce serial access to shared file descriptors and other external resources. On most Solaris man pages, you can find out if your library call is safe to use in reentrant code. Towards the bottom of the man page, you will see Categories of MT Library Calls. MT Safe means that the function can be called concurrently from different threads. MT Hot are "fast" MT Safe functions (usually not found on man pages). MT Unsafe means that the function cannot be called concurrently. Alternative means that there are MT Safe equivalents (e.g. gethostbyname() and gethostbyname_r()).

Another problem with mutexes is that contention for a mutex can lead to priority inversion. A higher priority thread can wait behind a lower priority thread if the lower priority thread holds a lock for which the higher priority thread is waiting. This can be eliminated/reduced by limiting the number of shared mutexes between different priority threads.

from http://randu.org/tutorials/threads/

言简意赅的一段，不错。

回来后又接到一个电话

是Bloomberg的电话面试，问一些C++的问题。不是很难，在《Accelerated C++》里面都有谈到过。但毕竟我是看书学习，没有什么hands-on exercise的经验，所以答得磕磕绊绊。最后那个面试者就说你不用过来面试了，因为Bloomberg expects quick and correct answers.

:-(

一天fail掉两个面试，看来还得加强基本功。没事要少看看围棋网站了。

主要跌跤的地方：

1. When is the copy constructor necessary?
我回答的是initialization的时候需要（这个对），还有就是convert的时候需要（这是错的，convert需要一个constructor，但不是copy constructor）。正确答案是：
－－－
11.3.6 The rule of three

Classes that manage resources such as memory require close attention to copy control. In general, the default operations will not suffice for such classes. Failure to control every copy can confuse users of the class and often will lead to run-time errors.

Consider our Vec class, but pretend that we did not define the copy constructor, assignment operator, or destructor. As we saw in §11.3.1/195, at best we will surprise our users. Users of Vec will almost surely expect that once they've copied one Vec into another, the two objects will be distinct. They will expect that operations on one Vec will not have any effect on the data held by the other.

Even worse, though, is that if we do not define a destructor, then the default destructor will be used. That destructor will destroy the pointer, but destroying a pointer does not free the space to which it points. The result will be a memory leak: The space consumed by Vecs will never be reclaimed.

If we fix the leak by providing a destructor, but we do not also add the copy constructor and assignment operator, then we set things up so that a crash is likely. In such a flawed implementation, it would be possible for two Vecs to share the same underlying storage, as we illustrated in the first diagram in §11.3.1/196. When one of those objects is destroyed, the destructor will destroy that shared storage. Any subsequent reference through the undestroyed copy will lead to disaster.

Classes that allocate resources in their constructors require that every copy deal correctly with those resources. Such classes almost surely need a destructor to free the resources. If the class needs a destructor, it almost surely needs a copy constructor, as well as an assignment operator. Copying or assigning objects of classes that allocate resources usually allocates those resources in the same way that creating an object from scratch does. To control every copy of objects of class T, you need

    T::T(); one or more constructors, perhaps with arguments
    T::~T() the destructor
    T::T(const T&) the copy constructor
    T::operator=(const T&) the assignment operator

Once we have defined these operations, the compiler will invoke them whenever an object of our type is created, copied, assigned, or destroyed. Remember that objects may be created, copied, or destroyed implicitly. Whether implicitly or explicitly, the compiler will invoke the appropriate operation.

Because the copy constructor, destructor, and assignment operator are so tightly coupled, the relationship among them has become known as the rule of three: If your class needs a destructor, it probably needs a copy constructor and an assignment operator too.
－－－
From Andrew Koenig's 《Accelerated C++》

2. Can a copy constructor be virtual?
正确答案：No. （原因尚不确切）Only destructors can be virtual.

3. When do we need a virtual destructor?
正确答案是：
－－－
Dynamic binding refers to the ability to select at run time which function to run based on the actual type of the object on which the function is called. Dynamic binding is in effect for calls to virtual functions made through a pointer or a reference. The fact that a function is virtual is inherited, and need not be repeated in the derived classes.

Derived classes are not required to redefine their virtual functions. If a class does not redefine a virtual, then it inherits the nearest definition for that function. However, any virtual functions that the class does contain must be defined. It is often a source of mysterious error messages from compilers to declare but not define a virtual function.

Overriding: A derived-class member function overrides a function with the same name in the base class if the two functions have the same number and types of parameters and both (or neither) are const. In that case, the return types must also match, except that as in §13.4.2/246, if the base-class function returns a pointer (or reference) to a class, the derived-class function can return a pointer (or reference) to a derived class. If the argument lists don't match, the base- and derived-class functions are effectively unrelated.

virtual destructors: If a pointer to the base class is used to delete an object that might actually be a derived-class object, then the base class needs a virtual destructor. If the class has no other need for a destructor, then the virtual destructor still must be defined and should be empty:

    class base {
    public:
        virtual ~base(){ }
    };

As with any other function, the virtual nature of the destructor is inherited by the derived classes, and there is no need to redefine the destructor in the derived classes.
－－－
Also from 《Accelerated C++》. 其实这第三个问题我知道什么时候需要一个destructor，就是总是无法精确地表达这层意思，关键还是没做过实际的C++项目，印象不深刻。

今天有个Qualcomm的面试

是第一轮，直接和hiring manager谈的。别的都还好，有两个问题。
1. White-board coding of factor function.
---
int factor(int n)
{
    int result;

    if (n == 1)
        return 1;
    else if (n > 0)
        result = factor(n - 1); /* should be result = n * factor (n - 1); */
    else
    {
        printf("input must be positive");
        return -1;
    }

    return result;
}
---
惭愧的是，这么个简单的白板写code都没有答对，非要等到manager说“Where is the multiplication?”我才想起来。

2. What is a thread deadlock?
这是我的死穴，以前工作中没有遇到过，也没有什么这方面的知识。一定要补一下。看来以后空闲的时候不能光看围棋，水木的C/C++版面也得经常去学习一下。要不碰到面试还真的没辙。

估计这次面试后，进入第二轮的希望不大。

Friday, February 11, 2011

static member in a class?

想起来Scott那天还问道static member in a class和non-static member有什么不同。前几天一直在看《Accelerated C++》，对这个有印象，就回答static member is not associated with any instance/object of the class（Section 13.4, Page 244~245）。

但是Scott又继续追问：“这个性质有什么作用？”我立刻就有些傻眼，当时书上好象是谈到这样做的话，外面的程序call这个static member func的时候，前面要加一个classi qualifier，比如这样：sort(students.begin(), students.end(), Student_info::compare)。好处是For our purposes, static member functions have one significant advantage: Their names are within the scope of their class. So, when we say that compare is a static member we are defining a function named Student_info::compare. Because the function has a qualified name, it does not overload the compare that we used to compare Core objects. Thus, our users will be able to call sort, passing Student_info::compare, and the compiler can know which function they want.

更详细的描述可以见于Section 15.2.3, Page 282~283：
－－－
15.2.3 Padding the output

We can now think about our padding function. Because we want to access this function from each of the derived classes, we'll define the pad operation as a function member of Pic_base that is both static and protected:

class Pic_base {
    // as before
protected:
    static void pad(std::ostream& os, wd_sz beg, wd_sz end) {
        while (beg != end) {
            os << " ";
            ++beg;
        }
    }
};

This function takes an ostream on which to write blanks, and two values that control how many blanks to write. When a display function needs to call pad, it will pass the current column number and one past the last column number that needs to be filled in by the current display operation. The pad function will fill this range with blanks.

Note the use of the static keyword on the declaration of pad. As we saw in §13.4/244, this use of static indicates that pad is a static member function. Such functions differ from an ordinary member function in that they are not associated with an object of the class type.

It may also be surprising that we can define a member function for an abstract base class. After all, if there can be no objects of the base class, why should there be member functions? However, remember that each derived object contains a base-class part. Each derived class also inherits any member functions defined in the base. Thus, a base-class function will execute on the base-class portion of a derived object. In this particular case, the function that we are defining is a static member, so the question of access to members of the base is moot. But it is important to realize that abstract classes may define data members, and (ordinary) member functions, as well as static ones. These functions will access the base-class objects that are part of derived objects.

Static members (both functions and static data members, which we can also define) are useful in that they let us minimize the names that are defined globally. Our pad function is a good example. We can imagine many abstractions that have the notion of padding. In this book we talked about padding in the context of writing a formatted report of student grades, as well as in the context of writing Pictures. If the Picture class were to define pad as a global function, then we would not also be able to define a pad function for Student_info, or vice versa. By making pad a static member, we allow for the fact that other abstractions in our program might have the notion of padding. As long as each class defines what pad means only in the context of the class, these mutually independent notions of padding can coexist within our program.
－－－
简言之，static member的好处就是可以minimize the names that are defined globally。但是当时没能明确地说出这一点，还是概念模糊所导致的。

Thursday, February 10, 2011

想起来昨天有几个问题没有答好

1. 在C里面，string的data structure是什么？

我想来想去，C的string不就是用char *来表示的吗？比如以下的code：
－－－
char * pStr ＝ malloc(SOME_LEN + 1); /* + 1 for the ending 0x00 byte */
strcpy(pStr, ...);
－－－
不就是一块continuous memory area吗？

Scott似乎对这个答案不满意，但我实在也没想起来是什么。后来他告诉我C里面string是通过character array实现的。嗯，从数据结构的角度讲是这样的，不知为什么我总是没想起来。不过后来他问我这个array到什么时候终止我倒是答上来了，terminated with null character (0x00)，这个简单。

2. 同样是在C语言里面，除了strcat()以外，有没有其他concatenate string的方法？

我当时只能想出strcat()的方法，比如下面的帖子：
－－－
In C, strings are just char arrays. So you can't add to them directly.

Here is an example from cplusplus.com:

char str[80];
strcpy (str,"these ");
strcat (str,"strings ");
strcat (str,"are ");
strcat (str,"concatenated.");

So in your example, you would want:

char *foo = "foo";
char *bar = "bar";
char str[80];
strcpy (str, "TEXT ");
strcat (str, foo);
strcat (str, bar);

The return value of strcat can just be ignored, it returns the same pointer as was passed in for the first argument. You don't need this value. The return value is simply for convenience, so it allows you to chain the calls into one line of code.

For the first parameter, you need to have the destination buffer itself. The destination buffer must be a char array buffer. Example char buffer[1024];

Be very careful that the first parameter has enough space for what you're trying to copy in. If available to you, it is safer to use functions like: strcpy_s and strcat_s which explicitly specify the size of the destination buffer.

You can never use a string literal as a buffer, you must always use your own buffer.

by Brian R. Bondy in http://stackoverflow.com/questions/308695/c-string-concatenation
－－－
对我提的strcat()的答案，Scott应该是不满意的，但他当时也没有说成还有什么别的方法。

今天早上去跑步的时候，想到这个问题还是应该再琢磨一下，就去网上搜索了一下，看到下面的帖子：
－－－
Avoid using strcat. The cleanest and, most importantly, the safest way is to use snprintf:

char buf[256];
snprintf(buf, sizeof(buf), "%s%s%s%s", str1, str2, str3, str4);

by Alex B in http://stackoverflow.com/questions/308695/c-string-concatenation

sizeof() only works here for char buf[...]. NOT for char * buf = malloc(...). There aren't many differences between arrays and pointers, but this is one of them!

by Mr.Ree in http://stackoverflow.com/questions/308695/c-string-concatenation

Here's my problem: I have an array, which contains a command a[1], followed by several command args a[2], a[3], ...

What I need to do is the following

    •Create a string, consisting of the cmd and a combination of args e.g.: cmd arg1 arg2 arg3
    •Execute that command-string

Here's how I woud do it (pseudo-code):

    1.Precompute the length of each arg and store it in an array
    2.Get a combination (using GNU Scientific Library)
    3.Compute the size in bytes needed to allocate the string (length of cmd + 1 + lengthof arg1 + 1 + argn-1 + 1) (the +1 generally for the for the blank and at the end for the \0)
    4.Build the string by using strcat
    5.Execute the command-string

Well, it works but I wonder if using strcat that deliberately is actually efficient/the right way to do it. Any suggestions?

by Alok in http://stackoverflow.com/questions/2272418/efficient-string-concatenation-in-c

No, using strcat() is not efficient since it has to step through the string to find the end each time you call it.

Much better to either do it all at once using snprintf() if you have it (and can squeeze your arguments in there), or do it yourself using direct pointer manipulation.

Of course, for this to matter in practice you need to be running this command very often indeed.

by unwind in http://stackoverflow.com/questions/2272418/efficient-string-concatenation-in-c

strcat(), as well as all string manipulation functions from the standard library, is inefficient. this is due to way strings are stored in C, namely zero-terminated, thus each function has to find the end of the string by iterating over each character.

anyway, you are doing a premature optimization: the multiple strcat() calls here will execute really fast compared to the command execution, so you should not worry about the efficiency of your way of concatenating.

before optimizing part of a code, you have to show that it is a bottleneck and that optimizing will really improve execution time. in most cases, there is no need to optimize: it is simply not worth the time spent.

by Adrien Plisson in http://stackoverflow.com/questions/2272418/efficient-string-concatenation-in-c
－－－
哈，看到这些帖子我才想起来，以前我在写code的时候，确实在strcat()以外，还用过snprintf()来concatenate strings，昨天电话面试的时候，没想起来的原因可能是因为有一段时间没写code了，多少有些生疏所致。

Wednesday, February 9, 2011

跟Scott谈完之后，发现还有一个missed call

来自Oasis的Jon Mckee（http://www.oasiscorporation.com/contactus.html），不过他没有问太多，好象主要是探询一下我的visa status，知道我需要sponsorship之后，说这是个小公司，可能不赞助H1-B，不过无论如何，他会去check一下。

我提了持新加坡护照，以及可以做volunteer的事情，不过估计希望不大。

即使这样，一天能有两个电话，以前是没有过的，看来就业市场真地回暖了？

今天有个Phone Screen

没想到大名鼎鼎的SlickEdit，业界和UltraEdit齐名的编辑器公司，也在Morrisville的范围内。

Scott Westfall，这个公司的技术主管（Vice President）跟我打的电话，谈了有一个多钟头。C的问题基本还行，没有答出来的是关于queue这个数据结构的常用操作比如insert、remove这些。C++的问题，拜Andrew Koenig的《Accelerated C++》所赐，基本也答出来了，主要的问题是static member in a class的用处、virtual function、pure virtual function、dynamic binding、polymorphism、virtual destructor的价值、还有如果class里面含有指针，那么在这个class被pass去一个function的时候，两个class instances里面的指针会指向同一块memory area，这种情况需要在copy constructor和assignment operator中重新分配内存，再把内存值copy过来。

对了，还有一个问题没有答上来。他让我说出三个sorting algorithm，我想来想去，只想到两个，一个是binary sort，另一个是quicksort。第三个怎么也没有想出来。其实第一个就不对了，只有binary search，没有binary sort。详细的参考，在Wiki上有：http://en.wikipedia.org/wiki/Sorting_algorithm

Scott说这整个礼拜都会忙着phone screen，下个礼拜早些时候，可能礼拜二会给我答复。

虽然也答错了几个题，但到目前为止，这个phone interview是感觉最好的一次。尤其是当我答出缺少virtual destructor会导致memory leak的时候，Scott很满意，说他问的人少过5%能说出virtual destructor的用处及价值。

希望能有好运。

Friday, February 4, 2011

马克思和恩格斯

整个一月份，我基本都是在录入棋谱，完成了《棋经众妙》（林元美编著）与《围棋死活辞典》（上册，赵治勋编著）的SGF格式棋谱。

感觉非常充实，比整天海投简历却没有回音那种落寞要强多了。虽然录入棋谱的时候没有工作，但感觉却像是当年写作《资本论》的马克思，被恩格斯养着完成一项伟大的工作。

笑谈，呵呵。

猴年马月