Notes: C++ Concurrency in Action

 * Reading progress: pdf 161/530, book p. 138, 05-21
 * Reading progress: pdf 192/530, book p. 169, 05-22
 * Reading progress: pdf 210/530, book p. 187, 05-23
 * Reading progress: pdf 251/530, book p. 228, 05-24

# C++ Concurrency in Action

## 1 Introduction
##### 1.1 What is concurrency
##### Multiprocessing vs. multithreading
###### Multiprocessing
* uses OS-provided inter-process communication; processes are strongly isolated from each other

###### Multithreading
* all threads belong to one process and can freely access each other's resources
* before C++11 the standard had no multithreading support; C++11 still has no multiprocessing support. Without standard support you must use OS facilities
* this book covers multithreading.

##### 1.2 Why use multithreading
###### 1. Separation of concerns: different threads handle different things, e.g. a UI thread and background worker threads
###### 2. Performance
* run different parts of a task on different threads: task parallelism
* run the same task on different parts of the data: data parallelism

##### 1.2.3 When not to use multithreading
* multithreaded code is harder to write and read, and bug-prone

##### 1.3 Concurrency and multithreading in C++
###### 1.3.1 History
* C++98 had no multithreading; you had to use OS libraries or wrappers such as boost
* because the C++98 memory model said nothing about threads, even well-written third-party wrappers could still hit problems, e.g. different behavior on different platforms


###### 1.3.2 Concurrency support in the C++11 standard
* new multithreading features
	* a new memory model
	* managing threads <- chapter 2
	* protecting shared data <- chapter 3
	* synchronizing operations between threads <- chapter 4
	* low-level atomic operations <- chapter 5

* new atomic-operation features
	* no more platform-specific assembly code

###### 1.3.3 Efficiency of the C++ thread library
* design goal
	* keep the abstraction penalty so low that dropping to lower-level APIs buys little performance for a lot of trouble

* high-level classes do carry a cost, but roughly what a hand-rolled equivalent would cost; the compiler may also inline aggressively.
* designing programs to reduce contention <- chapter 8

##### 1.4 Hello world
* key points
	* <thread> <- managing threads
	* protecting shared data <- other headers
	* each thread needs an initial function
	* thread_name.join() makes the calling thread wait for it to finish

## 2 Managing threads
###### Covers
* various ways to launch threads
* waiting for a thread to finish vs. letting it run on its own
* uniquely identifying threads

#### 2.1 Basic thread management
##### 2.1.1 Launching a thread
* any callable works
	* e.g. a class that implements the function call operator

```
class A {
public:
    void operator()() const {
        // do work
    }
};
// construction
A a;
thread t1(a);

// thread t2(A());   // WRONG: parsed as a function declaration
thread t2((A()));    // extra parens prevent the "most vexing parse"
thread t3{A()};      // the new uniform initialization syntax
thread t4([] { cout << "hi"; });
```

* after constructing a thread you must join() or detach() it, or the program is terminated <- std::thread's destructor calls std::terminate()
	* calling join()/detach() any time before the std::thread object is destroyed (e.g. a local variable going out of scope) is enough to stop the destructor from calling terminate()
	* once detach()ed, the std::thread object no longer refers to a thread and must not be join()ed (<- runtime error otherwise; check joinable()). Ownership and control are passed over to the C++ Runtime Library
	* make sure exceptions cannot skip the join()/detach() call.
	* if you detach(), you must ensure any data the thread accesses stays valid for its whole lifetime.
		* the thread copies the callable object it is constructed from, so the original may be destroyed.
		* if the thread holds pointers or references to local variables, you can get the dangling-pointer problem.
			* the thread can copy the outer local variables to stay safe
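The copy-to-stay-safe point above can be sketched like this (function and variable names are illustrative, not from the book): the lambda captures the local string by value, so the thread owns its own copy even after the enclosing frame is gone.

```cpp
#include <string>
#include <thread>

std::string make_message() { return "data"; }

// the thread gets its own copy of `local` via by-value capture;
// with detach() instead of join() the copy would still be safe,
// whereas a reference capture of `local` would dangle
std::string run_with_copy() {
    std::string local = make_message();
    std::string result;
    std::thread t([local, &result] { result = local + "!"; });
    t.join();
    return result;
}
```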

##### 2.1.2 Waiting for a thread to complete
* typical use: start several threads, then wait for them all to finish.
* join() also cleans up the storage associated with the thread.
	* hence join() may be called only once; joinable() reports whether a join is still possible
* if you want to
	* check whether a thread has finished, or
	* wait for a limited time
	* then you need other mechanisms, such as condition variables or futures

##### 2.1.3 Waiting in exceptional circumstances
* call join() in the catch block, or
* if join() must happen no matter what // to guarantee the thread ends before the function exits
	* wrap the thread in a class that join()s in its destructor; the class stores a reference to the thread.
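The wrapper idea can be sketched as follows (the book calls a similar class thread_guard; this condensation and the helper around it are mine):

```cpp
#include <thread>

// RAII wrapper: holds a reference to the thread and joins in the
// destructor, so join() runs even if an exception propagates out
class thread_guard {
    std::thread& t;
public:
    explicit thread_guard(std::thread& t_) : t(t_) {}
    ~thread_guard() {
        if (t.joinable())  // join() may only be called once
            t.join();
    }
    thread_guard(thread_guard const&) = delete;
    thread_guard& operator=(thread_guard const&) = delete;
};

int run_guarded() {
    int x = 0;
    {
        std::thread t([&x] { x = 42; });
        thread_guard g(t);
    }  // g joins here, so the write to x has completed
    return x;
}
```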

##### 2.1.4 running thread in the background

#### 2.2 Passing arguments to a thread function
* arguments are copied by default
	* so the thread constructor does not handle reference parameters as you would expect
		* use std::ref
* the thread constructor copies the supplied arguments first; conversions happen later, in the context of the new thread.
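A minimal sketch of the std::ref point (names are mine): without std::ref the thread would mutate its own copy of `n` and the caller would see 0.

```cpp
#include <functional>
#include <thread>

void increment(int& n) { n += 1; }

// std::ref wraps n so a real reference reaches the thread function
int call_with_ref() {
    int n = 0;
    std::thread t(increment, std::ref(n));
    t.join();
    return n;
}
```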

* invoking a member function on a specific object:

```
class A {
public:
    void f(int);
};
A a;
int i = 1;
thread t(&A::f, &a, i);  // runs a.f(i) on the new thread
```

* std::thread is movable, but not copyable

#### 2.3 Transferring ownership of a thread
* if thread t already has an associated thread, then t = move(t2) /* t2 also has an associated thread */ makes t call terminate();

```
you can't just 'drop' a thread by assigning a new value to the std::thread object that manages it
```

* return a thread from a function

```
thread f() {
    void f1();
    return thread(f1);
}
thread f() {
    void f1(int);
    thread t(f1, 1);
    return t;
}
```


* pass a thread to function call

```
void f(thread t);
void f1();
// moving from a temporary object is automatic and implicit
f(thread(f1));
f(move(t));
```

* scoped_thread

```
class scoped_thread {
    std::thread t;
public:
    explicit scoped_thread(std::thread t_) : t(std::move(t_)) {
        if (!t.joinable()) {
            throw std::logic_error("No thread");
        }
    }
    ~scoped_thread() {
        t.join();
    }
    scoped_thread(scoped_thread const&) = delete;
    scoped_thread& operator=(scoped_thread const&) = delete;
};
```

* move makes containers of threads (e.g. for thread pools) possible
	* e.g. store them in a vector

```
std::vector<thread> vs;
for (auto i = 0; i < 5; ++i) {
    vs.push_back(thread(f1, i));
}
// call join() on each thread in turn
for_each(vs.begin(), vs.end(), std::mem_fn(&thread::join));
```


#### 2.4 Choosing the number of threads at runtime
* because you cannot return a value directly from a thread
	* pass in a reference, or
	* use futures

* get the number from the hardware

```
unsigned const hardware_threads = std::thread::hardware_concurrency();
```
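The pass-in-a-reference workaround above can be sketched like this (function names are mine; the book's parallel accumulate is more elaborate):

```cpp
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// the "result" comes back through the out-parameter, since
// std::thread itself cannot return a value
void sum_range(const std::vector<int>& v, long& out) {
    out = std::accumulate(v.begin(), v.end(), 0L);
}

long threaded_sum(const std::vector<int>& v) {
    long result = 0;
    std::thread t(sum_range, std::cref(v), std::ref(result));
    t.join();
    return result;
}
```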

#### 2.5 Identifying threads
* std::thread::id
	* t.get_id()
	* std::this_thread::get_id()
* can be copied and compared
* std::hash<std::thread::id>

## chapter3 Sharing data between threads
###### Covers
* problems with sharing data between threads
* protecting data with mutexes
* alternative facilities for protecting shared data

### 3.1 Problems with sharing data between threads
* the problems with sharing data between threads are all due to modifying data.
* invariants
	* doubly linked list:
		* if A's next is B, then B's prev is A
			* while deleting a node, this invariant is temporarily broken
	* the simplest potential problem with modifying data that's shared between threads is that of broken invariants.

#### 3.1.1 Race conditions
* a race condition is anything where the outcome depends on the relative ordering of execution of operations on two or more threads.
	* when talking about concurrency, the term *race condition* is usually used to mean a *problematic* race condition
	* *data race* means the specific type of race condition that arises because of concurrent modification of a single object
		* a *data race* causes *undefined behavior*
* it's when a race condition leads to broken invariants that there is a problem.
	* problematic race conditions typically occur where completing an operation requires modification of two or more distinct pieces of data
		* in these cases, because the data must be modified in separate instructions, another thread could access the data structure in between.

#### 3.1.2 Avoiding problematic race conditions
* several ways:
	* an approach in the spirit of transactions
	* *lock-free programming*: modify the design of the data structure and its invariants.

* the most basic mechanism for protecting shared data provided by C++ is the *mutex*

### 3.2 Protecting shared data with mutexes
#### 3.2.1 Using mutexes in C++
* mutex
	* don't use it directly: you'd need to unlock() manually on every path (including exceptions)
	* use std::lock_guard instead

```
/*
lock_guard is like unique_lock, but unique_lock can lock()/unlock() freely,
while lock_guard can only lock at construction
*/
std::lock_guard<std::mutex> guard(mtx);

std::lock(lhs.mtx, rhs.mtx);
// already locked by the line above, so adopt the lock
std::lock_guard<std::mutex> guard(lhs.mtx, std::adopt_lock);
```

* in the majority of cases it's common to group the mutex and the protected data together in a class rather than use global variables. if all the member functions of the class lock the mutex before accessing any other data members and unlock it when done, the data is well protected.

#### 3.2.2 Structuring code for protecting shared data
* rules of thumb
	* member functions must not return pointers or references to the protected data
	* member functions must not pass such pointers/references to the functions they call

#### 3.2.3 Spotting race conditions inherent in interfaces
###### Examples

```
e.g. a stack protected by one big lock:
after a caller gets a result from size()/empty(), but before it acts on that
information, the stack's state may already have changed.

in pop(), the underlying container has already popped, but returning the
value throws an exception -> the data is lost forever.
```

###### Race conditions in interfaces
* arise from back-to-back calls by multiple threads to different operations of one class
* the solution is to change the interface

* a stack that supports multithreading
	* option 1: pass in a reference
		* requires the element type to be default-constructible
		* and assignable
			* many user-defined types do not support assignment, but do support move construction or copy construction
	* option 2: require a no-throw copy constructor or move constructor

```
std::is_nothrow_copy_constructible
std::is_nothrow_move_constructible
```

	* option 3: return a pointer to the popped item
		* e.g. shared_ptr
	* also: check again inside pop() whether the stack is empty; if empty, throw an exception
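Option 3 plus the check-inside-pop idea can be condensed into a sketch like this (a simplification of the book's thread-safe stack; the exception type is my choice):

```cpp
#include <memory>
#include <mutex>
#include <stack>
#include <stdexcept>

// pop() returns a shared_ptr and throws on empty, so the
// check and the pop happen under a single lock
template <typename T>
class threadsafe_stack {
    std::stack<T> data;
    mutable std::mutex m;
public:
    void push(T value) {
        std::lock_guard<std::mutex> lk(m);
        data.push(std::move(value));
    }
    std::shared_ptr<T> pop() {
        std::lock_guard<std::mutex> lk(m);
        if (data.empty()) throw std::runtime_error("empty stack");
        auto res = std::make_shared<T>(std::move(data.top()));
        data.pop();
        return res;
    }
};
```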

###### Note
* problematic race conditions in interfaces essentially arise from locking at too small a granularity; the protection does not cover the entirety of the desired operation.

#### 3.2.4 Deadlock: the problem and a solution
* advice
	* always lock the two mutexes in the same order, e.g. always lock A before B

```
but:
given some f(A, A),
if one thread calls f(a, b) and another calls f(b, a), you still deadlock.
```

###### std::lock() locks several mutexes at once
* if it cannot acquire them all, it throws and releases everything it had locked

```
lock(mtx1, mtx2);
lock_guard<mutex> lg1(mtx1, adopt_lock);
lock_guard<mutex> lg2(mtx2, adopt_lock);
```

#### 3.2.5 Further guidelines for avoiding deadlock
###### 1. Other situations where deadlock appears
* two threads each call join() on the other
* likewise with more threads...

##### 2. Guideline
* do not wait for another thread if there is a chance it's waiting for you.

###### 2.1 Hold at most one lock per thread
* if you need several locks, acquire them in one go with std::lock()

###### 2.2 Avoid calling user-supplied code while holding a lock

###### 2.3 Acquire locks in a fixed order
* if you must take several locks and cannot take them in one std::lock() call, take them in the same order in every thread.

##### 2.4 Use a lock hierarchy
* give each mutex a layer number and track the layers of the mutexes the current thread holds.
	* a thread is not permitted to lock a mutex if it already holds a lock from a lower layer.

###### User-defined locks
* implement three member functions
	* lock()
	* unlock()
	* try_lock()
		* if the lock on the mutex is held by another thread, it returns false rather than waiting until the calling thread can acquire the lock on the mutex.
	
```
#include <climits>
#include <mutex>
#include <stdexcept>

class hierarchical_mutex {
    std::mutex internal_mutex;
    // the layer this mutex belongs to
    unsigned long const hierarchy_value;
    // the thread's previous layer, restored on unlock
    unsigned long previous_hierarchy_value;
    // the current thread's layer
    static thread_local unsigned long this_thread_hierarchy_value;
    void check_for_hierarchy_violation() {
        if (this_thread_hierarchy_value <= hierarchy_value) {
            throw std::logic_error("mutex hierarchy violated");
        }
    }
    void update_hierarchy_value() {
        previous_hierarchy_value = this_thread_hierarchy_value;
        this_thread_hierarchy_value = hierarchy_value;
    }
public:
    explicit hierarchical_mutex(unsigned long value) :
        hierarchy_value(value),
        previous_hierarchy_value(0) {}
    void lock() {
        check_for_hierarchy_violation();
        internal_mutex.lock();
        update_hierarchy_value();
    }
    void unlock() {
        this_thread_hierarchy_value = previous_hierarchy_value;
        internal_mutex.unlock();
    }
    bool try_lock() {
        check_for_hierarchy_violation();
        if (!internal_mutex.try_lock())
            return false;
        update_hierarchy_value();
        return true;
    }
};
thread_local unsigned long hierarchical_mutex::this_thread_hierarchy_value(ULONG_MAX);
```

###### Extending these guidelines beyond locks
* the idea extends to situations such as threads waiting on each other to finish.
	* a thread should wait only for threads lower down the hierarchy.
		* e.g. ensure that your threads are joined in the same function that started them.
* it's a bad idea to wait for a thread while holding a lock.
* once you have designed your code to avoid deadlock, std::lock() and std::lock_guard cover most cases of simple locking.
	* std::unique_lock provides more flexibility.


#### 3.2.6 Flexible locking with std::unique_lock
* std::unique_lock(mtx, std::defer_lock)
	* indicates the mutex should remain unlocked on construction
	* you can then
		* call lock() on the unique_lock, or
		* pass the unique_lock object itself to std::lock()

```
std::unique_lock<std::mutex> lock_a(mtx1, std::defer_lock);
std::unique_lock<std::mutex> lock_b(mtx2, std::defer_lock);
std::lock(lock_a, lock_b);

bool b = lock_a.owns_lock();
```

* unlocking promptly, as soon as the lock is no longer needed, can improve performance.
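The early-unlock point can be sketched like this (names are illustrative): the lock covers only the shared read, not the slow computation.

```cpp
#include <mutex>

std::mutex m;
int shared_value = 0;

int process() {
    std::unique_lock<std::mutex> lk(m);
    int copy = shared_value;  // read under the lock
    lk.unlock();              // release before the slow part
    return copy * 2;          // "slow" work done without holding the lock
}
```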


#### 3.2.7 Transferring mutex ownership between scopes
* std::unique_lock is movable but not copyable
* a function can lock a mutex and transfer ownership of that lock to the caller.

```
unique_lock<mutex> get_lock() {
    extern mutex mtx;
    unique_lock<mutex> lk(mtx);
    // prepare data here, while locked
    return lk;  // moved out automatically
}
int main() {
    unique_lock<mutex> lk(get_lock());
    // use the data, still locked
}
```

#### 3.2.8 Locking at an appropriate granularity
* lock granularity describes the amount of data protected by a single lock.
* locking at an appropriate granularity is not only about the amount of data locked; it's also about how long the lock is held and which operations are performed while it is held.
* a lock should be held only for the minimum time needed to perform the required operations.
	* time-consuming operations such as acquiring another lock or waiting for I/O should not be done while holding a lock unless absolutely necessary.
* if you don't hold the required locks for the entire duration of an operation, you expose yourself to race conditions.
* when not all accesses to the data structure require the same level of protection, we need an alternative mechanism instead of a plain std::mutex

### 3.3 Alternative facilities for protecting shared data
* mutexes are not the only game in town when it comes to protecting shared data; there are alternatives that provide more appropriate protection in specific scenarios.

#### 3.3.1 Protecting shared data only during initialization
* wrong: double-checked locking
	* p may already be written while the resource is not yet fully initialized

```
std::shared_ptr<resource> p;
std::mutex resource_mutex;
void f() {
    if (!p) {                       // 1: unsynchronized read
        std::lock_guard<std::mutex> lk(resource_mutex);
        if (!p) {
            p.reset(new resource);  // a thread at 1 may see p set before
                                    // the resource is fully constructed
        }
    }
}
```
```

###### Best practice
* std::call_once
	* works with any function or callable object.

```
std::shared_ptr<resource> res_ptr;
std::once_flag res_flag;
void init_res() {
    res_ptr.reset(new resource);
}
void f1() {
    std::call_once(res_flag, init_res);
    res_ptr->f2();
}
```

* also used for lazy initialization of class members.

```
class A {
    std::once_flag flag;
    resource res;
    void init_res() {
        res = resource();
    }
public:
    void f1() {
        std::call_once(flag, &A::init_res, this);
    }
};
```

* std::once_flag cannot be copied or moved, just like std::mutex

#### 3.3.2 Protecting rarely updated data structures
* if any thread holds a shared lock, a thread that tries to acquire an exclusive lock blocks until all other threads have released their locks.
* if any thread holds an exclusive lock, no other thread may acquire a shared or exclusive lock until the first thread has released its lock.

```
#include <map>
#include <string>
#include <mutex>
#include <boost/thread/shared_mutex.hpp>
class dns_entry;
class dns_cache {
    std::map<std::string, dns_entry> entries;
    mutable boost::shared_mutex entry_mutex;
public:
    dns_entry find_entry(std::string const & domain) const {
        boost::shared_lock<boost::shared_mutex> lk(entry_mutex);
        std::map<std::string, dns_entry>::const_iterator const it = entries.find(domain);
        return (it == entries.end()) ? dns_entry() : it->second;
    }
    void update_or_add_entry(std::string const & domain, dns_entry const & dns_details) {
        std::lock_guard<boost::shared_mutex> lk(entry_mutex);
        entries[domain] = dns_details;
    }
};
```

#### 3.3.3 Recursive locking
* std::recursive_mutex
	* the same thread can acquire multiple locks on a single instance.
	* call unlock() as many times as you called lock() to release it.
		* std::lock_guard<std::recursive_mutex>
		* std::unique_lock<std::recursive_mutex>
	* most of the time, if you think you want a recursive mutex, you probably need to change your design instead.
	* most common (quick-and-dirty) use: a class with one big lock, where every member function locks the mutex first; some member functions call other member functions, and since a plain mutex cannot be locked twice by the same thread, a recursive_mutex is used
		* better: extract the shared work into a new private member function that does not lock /* its callers have already locked */ and have both call it.

## chapter4 Synchronizing concurrent operations
* covers:
	* waiting for an event
	* waiting for one-off events with futures
	* waiting with a time limit
	* using synchronization of operations to simplify code

* often one thread needs to wait for another
	* you could spin on shared data: while(!finished);
	* best practice:
		* condition variables
		* futures

### 4.1 Waiting for an event or other condition
* the most basic mechanism for waiting for an event to be triggered by another thread is the condition variable
	* conceptually, a condition variable is associated with some event or other condition
		* one or more threads can wait for that condition to be satisfied
		* when some thread has determined that the condition is satisfied, it can notify one or more of the threads waiting on the condition variable.

#### 4.1.1 Waiting for a condition with condition variables
###### standard header condition_variable
* provides:
	* std::condition_variable
	* std::condition_variable_any
* both need to work with a mutex in order to provide appropriate synchronization
	* std::condition_variable is limited to working with std::mutex
	* std::condition_variable_any can work with anything that meets some minimal criteria for being mutex-like

```
#include <condition_variable>
#include <mutex>
#include <queue>

class data_chunk;
std::mutex mut;
std::queue<data_chunk> data_queue;
std::condition_variable data_cond;
data_chunk prepare_data();
bool more_data_to_prepare();
void process(data_chunk&);
bool is_last_chunk(data_chunk const&);
void data_preparation_thread() {
    while (more_data_to_prepare()) {
        data_chunk const data = prepare_data();
        std::lock_guard<std::mutex> lk(mut);
        data_queue.push(data);
        data_cond.notify_one();
    }
}

void data_processing_thread() {
    while (true) {
        std::unique_lock<std::mutex> lk(mut);
        data_cond.wait(lk, [] { return !data_queue.empty(); });
        data_chunk data = data_queue.front();
        data_queue.pop();
        lk.unlock();
        process(data);
        if (is_last_chunk(data))
            break;
    }
}
```

* using a queue to transfer data between threads is a common scenario

#### 4.1.2 Building a thread-safe queue with condition variables

* data members declared mutable may be modified in const member functions
	* so if a const member function needs mutex.lock(), declare the member as mutable std::mutex mtx;

* if the waiting thread is going to wait only once, so when the condition is true it will never wait on this condition variable again, a condition variable might not be the best choice of synchronization mechanisms. This is especially true if the condition being waited for is the availability of a particular piece of data. In this scenario, a *future* might be more appropriate.
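The queue of this section can be condensed into a sketch (a simplification of the book's threadsafe_queue; try_pop and the shared_ptr overloads are omitted):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// mutex is mutable so the const empty() can lock it
template <typename T>
class threadsafe_queue {
    mutable std::mutex mut;
    std::queue<T> data_queue;
    std::condition_variable data_cond;
public:
    void push(T value) {
        std::lock_guard<std::mutex> lk(mut);
        data_queue.push(std::move(value));
        data_cond.notify_one();
    }
    void wait_and_pop(T& value) {
        std::unique_lock<std::mutex> lk(mut);
        data_cond.wait(lk, [this] { return !data_queue.empty(); });
        value = std::move(data_queue.front());
        data_queue.pop();
    }
    bool empty() const {
        std::lock_guard<std::mutex> lk(mut);
        return data_queue.empty();
    }
};
```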

### 4.2 Waiting for one-off events with futures
* header *future*
###### future types:
* std::future<>
	* an instance of std::future is the one and only instance that refers to its associated event
	* .get() may be called only once; a second call throws std::future_error (no associated state).

* std::shared_future<>
	* multiple instances of std::shared_future may refer to the same event.
	* all instances become ready at the same time, and they may all access any data associated with the event.

* the template parameter is the type of the associated data.
	* std::future<void> and std::shared_future<void> indicate there is no associated data.

* if multiple threads need to access a single future object, they must protect access via a mutex or other synchronization mechanism.
	* but multiple threads may each access their own copy of a std::shared_future<> without further synchronization, even if they all refer to the same asynchronous result.
* the most basic of one-off events is the result of a calculation that has been run in the background.

#### 4.2.1 Returning values from background tasks
* use std::async to start an asynchronous task for which you don't need the result right away.
	* when you need the value, call get() on the future; the thread blocks until the future is ready and then returns the value.
	* if arguments are rvalues, the copies are created by moving the originals. this allows the use of move-only types as both the function object and the arguments.

```
T f1(arg1, arg2);
std::future<T> fu = std::async(f1, arg1, arg2);

class A {
public:
    int f2(arg1, arg2);
    int operator()(double);
};
A a;
// call a.f2(arg1, arg2) via pointer
std::future<T> fu = std::async(&A::f2, &a, arg1, arg2);
// call a.f2(arg1, arg2) via reference
std::future<T> fu = std::async(&A::f2, ref(a), arg1, arg2);
// call tmp_a.f2(arg1, arg2) where tmp_a is a copy of a
std::future<T> fu = std::async(&A::f2, a, arg1, arg2);

// call tmp_a(1.0) where tmp_a is move-constructed from A()
std::future<T> fu = std::async(A(), 1.0);
// call a(1.0)
std::future<T> fu = std::async(ref(a), 1.0);
```

* std::async's extra argument
	* of type std::launch
	* std::launch::deferred
		* the function call is deferred until wait() or get() is called on the future
	* std::launch::async
		* the function must run on its own thread
	* default: std::launch::deferred | std::launch::async
		* the implementation may choose.

```
...async(std::launch::async, A(), 1.0);
```

#### 4.2.2 Associating a task with a future
* std::packaged_task<T>
	* ties a future to a function or callable object
	* when invoked, it calls the associated function or callable object and makes the future ready, with the return value stored as the associated data.
	* T is a function signature
		* void() -> function with no arguments and no return value
		* int(std::string&, double&)
	* construction requires a function or callable object
		* whose signature need not match T exactly; implicit conversions suffice.
	* std::packaged_task<> is itself a callable object implementing operator(): std::packaged_task<T1(arg1...)> holds a std::future<T1> and provides void operator()(arg1...)
		* can be wrapped in a std::function object
		* can be passed to a std::thread
		* can be invoked directly
* uses:
	* as a building block for thread pools and other task-management schemes
		* running each task on its own thread, or
		* running all tasks sequentially on a particular background thread.
	* if a large operation can be divided into self-contained subtasks, each can be wrapped in a std::packaged_task<> instance, and that instance passed to the task scheduler or thread pool.
		* this abstracts away the details of the tasks
		* the scheduler deals only with std::packaged_task<> instances rather than individual functions.
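A minimal sketch of the packaged_task mechanics described above (names are mine): take the future first, then move the task onto a thread.

```cpp
#include <future>
#include <thread>

int add(int a, int b) { return a + b; }

// the packaged_task wraps the callable; invoking it on the thread
// makes the future ready with the return value
int run_packaged() {
    std::packaged_task<int(int, int)> task(add);
    std::future<int> fut = task.get_future();
    std::thread t(std::move(task), 2, 3);  // move-only, so std::move
    t.join();
    return fut.get();
}
```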

##### Passing tasks between threads
* many GUI frameworks require that updates to the GUI be done from specific threads; if another thread needs to update the GUI, it must send a message to the right thread in order to do so.
	* std::packaged_task provides one way of doing this without requiring a custom message ...

```
// Running code on a GUI thread using std::packaged_task
#include <deque>
#include <mutex>
#include <future>
#include <thread>
#include <utility>

std::mutex m;
std::deque<std::packaged_task<void()>> tasks;
bool gui_shutdown_message_received();
void get_and_process_gui_message();

void gui_thread() {
    while (!gui_shutdown_message_received()) {
        get_and_process_gui_message();
        std::packaged_task<void()> task;
        {
            std::lock_guard<std::mutex> lk(m);
            if (tasks.empty())
                continue;
            task = std::move(tasks.front());
            tasks.pop_front();
        }
        task();
    }
}
std::thread gui_bg_thread(gui_thread);

template<typename Func>
std::future<void> post_task_for_gui_thread(Func f) {
    std::packaged_task<void()> task(f);
    std::future<void> res = task.get_future();
    std::lock_guard<std::mutex> lk(m);
    tasks.push_back(std::move(task));
    return res;
}
```

#### 4.2.3 Making (std::)promises
* uses:
	* tasks that cannot be expressed as a simple function call
	* tasks where the result may come from more than one place

* std::promise<T> provides a means of setting a value, which can later be read through an associated std::future<T> object
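A minimal promise/future round trip, sketched (names are mine): the producer sets the value, the consumer blocks on get().

```cpp
#include <future>
#include <thread>

int promise_demo() {
    std::promise<int> p;
    std::future<int> fut = p.get_future();
    // producing thread sets the value through the promise
    std::thread t([&p] { p.set_value(21); });
    int v = fut.get();  // blocks until set_value has run
    t.join();
    return v * 2;
}
```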

#### 4.2.4 Saving an exception for the future

```
future<int> fu = async(f1, -1);
// if f1(-1) throws an exception,
// fu.get() rethrows that exception
fu.get();
```

* whether the rethrown exception is the original or a copy depends on the compiler/implementation
* the same holds for packaged_task and promise
* a promise can set_exception()

```
eg1:
try {
    pro.set_value(calculate_value());
} catch(...) {
    pro.set_exception(std::current_exception());
}

eg2:
// if the error outcome is known outright, set it explicitly
// cleaner code, and easier for the compiler to optimize
pro.set_exception(std::copy_exception(std::logic_error("foo")));
// note: standard C++11 spells this std::make_exception_ptr
```

* if
	* a promise is destroyed without its value being set, or
	* a packaged_task is destroyed without ever being invoked,
	* then future.get() throws
		* std::future_error, with error code
			* std::future_errc::broken_promise

#### 4.2.5 Waiting from multiple threads
* use std::shared_future when several threads wait on one event
* std::future is only *movable*
* std::shared_future is copyable
	* there can be multiple objects referring to the same associated state
	* get() may be called more than once:
		* multiple get()s from one thread
		* get()s from different threads

* std::shared_future's member functions are unsynchronized; to avoid data races when accessing a single object from multiple threads:
	* best practice: give each thread its own copy of the shared_future

* use case:
	* in a spreadsheet, first compute in parallel the cells whose values depend on no other cells, then the cells that do depend on others; the latter can hold shared_futures referencing the former

* constructing std::shared_future from std::future
	* initializing from an rvalue needs no explicit move

```
std::shared_future<int> sf (std::move(fu));
std::shared_future<int> sf(promise_a.get_future());
auto shared_fu = promise_a.get_future().share();
```

### 4.3 Waiting with a time limit
* scenarios
	* a hard limit on how long a piece of code may run
	* other work to do while the event hasn't happened yet
* two kinds of wait
	* wait for a duration
		* _for suffix
	* wait until a point in time
		* _until suffix
	* both kinds may wake early because the event occurred.

#### 4.3.1 Clocks
* now(): returns the current time of a clock.
	* std::chrono::system_clock::now() -> current time of the system clock
* time_point: the type of the clock's time points.
	* the type of clock_a::now() is clock_a::time_point

* the tick period of a clock is a fractional number of seconds
	* if it's only knowable at run time, the advertised period may be the average, the minimum, or something else, at the library writer's discretion.

* steady clock
	* ticks at a uniform rate (whether or not that matches its period) and cannot be adjusted
	* the *is_steady* static data member of the clock class is true if the clock is steady.
	* std::chrono::system_clock is not a steady clock
		* now() may return a time earlier than a previous call (e.g. after the OS clock is adjusted)
		* represents the "real time" clock of the system
		* provides functions for converting its time points to and from time_t values
	* std::chrono::steady_clock is a steady clock

* std::chrono::high_resolution_clock
	* provides the smallest possible tick period (and thus the highest possible resolution) of all the library-supplied clocks

#### 4.3.2 Durations
* std::chrono::duration<T1, T2>
	* T1: the representation type, e.g. int, long, double
	* T2: std::ratio<x1, x2>
		* x1 / x2 = the number of seconds each unit of the duration represents
	* examples:
		* a number of minutes stored in a *short* is std::chrono::duration<short, std::ratio<60, 1>>
		* a number of milliseconds stored in a double is std::chrono::duration<double, std::ratio<1, 1000>>

###### predefined:
* nanoseconds
* microseconds
* milliseconds
* seconds
* minutes
* hours

###### Conversions
* implicit conversion between durations is allowed where no truncation is required
	* hours to seconds: yes
	* seconds to hours: no
* explicit conversion: std::chrono::duration_cast<>
	* the result is truncated

###### Arithmetic
* durations support addition, subtraction, multiplication, and division
* .count() returns the number of units in the duration

```
std::chrono::milliseconds(1234).count() -> 1234
```

###### std::future waits
* the wait functions return a status indicating whether the wait timed out or the waited-for event occurred
	* std::future_status::timeout
	* std::future_status::ready
	* std::future_status::deferred
* the actual wait can be much longer, due to scheduling, OS clock-resolution effects, etc.

```
std::future<int> f = std::async(f1);
if (f.wait_for(std::chrono::milliseconds(35)) == std::future_status::ready) {

}
```

#### 4.3.3 Time points
###### std::chrono::time_point<>
* std::chrono::time_point<T1, T2>
	* T1: the clock it refers to
	* T2: the units of measurement (a std::chrono::duration<> type)
* .time_since_epoch()
	* returns a duration
* example:
	* std::chrono::time_point<std::chrono::system_clock, std::chrono::minutes>
* arithmetic
	* with a duration
		* std::chrono::high_resolution_clock::now() + std::chrono::nanoseconds(500)
	* with a time_point
		* subtracting two time_points of the same clock yields a duration

```
    auto start = std::chrono::high_resolution_clock::now();
    f1(1);
    auto end = std::chrono::high_resolution_clock::now();
    auto x = std::chrono::duration_cast<std::chrono::seconds>(end-start).count();
    
```

```
condition_variable cv;
bool done;
mutex m;
bool wait_loop() {
    auto const timeout = std::chrono::steady_clock::now() + std::chrono::milliseconds(500);
    unique_lock<mutex> lk(m);
    while (!done) {
        if (cv.wait_until(lk, timeout) == cv_status::timeout) {
            break;
        }
    }
    return done;
}
```

#### 4.3.4 Functions that accept timeouts
* std::this_thread::sleep_for()
* std::this_thread::sleep_until()
* std::condition_variable
* std::condition_variable_any
* std::future
* std::shared_future
* some mutexes
	* std::timed_mutex
	* std::recursive_timed_mutex
	* both support
		* try_lock_for()
		* try_lock_until()
* std::unique_lock<TimedLockable>

### 4.4 Using synchronization of operations to simplify code
* one way this helps simplify code is that it accommodates a much more functional (in the sense of functional programming) approach to concurrency: rather than sharing data directly between threads, each task is provided with the data it needs, and the result is disseminated to any other threads that need it through the use of futures.

#### 4.4.1 Functional programming with futures
* functional programming: a function's result depends only on its arguments, not on external state
* a pure function modifies no external state
* pure functions suit multithreading
	* no shared data means no mutexes
	* in Haskell, functions are pure by default

* C++11 is also friendly to pure-function-style programming
	* lambdas
	* std::bind
	* automatic type deduction for variables
	* futures
		* futures can be passed around between threads to let the result of one computation depend on the result of another, without any explicit access to shared data.

###### FP-style quicksort
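A sketch along the lines of the book's FP-style parallel quicksort (condensed from memory, not a verbatim listing): the input is consumed, the result is built from arguments alone, and the lower half is sorted via std::async so its result arrives through a future.

```cpp
#include <algorithm>
#include <future>
#include <list>

template <typename T>
std::list<T> parallel_quick_sort(std::list<T> input) {
    if (input.empty()) return input;
    std::list<T> result;
    result.splice(result.begin(), input, input.begin());  // take pivot
    T const& pivot = *result.begin();
    // move everything < pivot to the front of input
    auto divide_point = std::partition(input.begin(), input.end(),
                                       [&](T const& t) { return t < pivot; });
    std::list<T> lower;
    lower.splice(lower.end(), input, input.begin(), divide_point);
    // sort the lower half asynchronously; its result comes via a future
    std::future<std::list<T>> new_lower =
        std::async(&parallel_quick_sort<T>, std::move(lower));
    auto new_higher = parallel_quick_sort(std::move(input));
    result.splice(result.end(), new_higher);
    result.splice(result.begin(), new_lower.get());
    return result;
}
```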

#### 4.4.2 Synchronizing operations with message passing
* CSP: Communicating Sequential Processes
	* threads share no data and communicate only through channels
		* a message queue also works
		* each thread waits for a message, then processes it
	* also known as ...


### 4.5 Summary

## Chapter5 The C++ memory model and operations on atomic types
* covers:
	* the details of the C++11 memory model
	* the atomic types provided by C++
	* the operations available on those atomic types
	* how those operations can be used to provide synchronization between threads

* C++'s atomic types and atomic operations are low-level synchronization facilities; they typically compile down to just one or two instructions.

### 5.1 Memory model basics
#### 5.1.1 Objects and memory locations
* important things:
	* every variable is an object, including those that are members of other objects.
	* every object occupies at least one memory location.
	* variables of fundamental type occupy exactly one memory location, whatever their size, even if they are adjacent or part of an array.
	* adjacent bit fields are part of the same memory location

#### 5.1.2 Objects, memory locations, and concurrency
* to avoid races:
	* use a mutex, or
	* use the synchronization properties of *atomic* operations

#### 5.1.3 Modification orders

### 5.2 Atomic operations and types in C++
#### 5.2.1 The standard atomic types
* header <atomic>
* .is_lock_free()
	* returns true: operations are done with atomic instructions
	* returns false: a lock is used internally
	* only std::atomic_flag lacks this function; std::atomic_flag:
		* an atomic boolean type
		* guaranteed lock-free
		* can be used to build a simple lock, and from that all the other atomic types
		* .test_and_set(): returns the current value, then sets it to true
		* .clear(): sets the value to false
		* it cannot be copied or assigned, there is no test_and_clear, and there are no other operations

* std::atomic<>
	* on the most popular platforms the atomic variants of the built-in types probably are lock-free, but this is not required
	* not copyable or assignable in the conventional sense:
		* no copy constructors, no copy assignment operators
	* member functions
		* load()
		* store()
		* exchange()
		* compare_exchange_weak()
		* compare_exchange_strong()
		* +=, -=, &=, |=, ^=
		* integral types and the std::atomic<> specializations for pointers support ++ and --
		* fetch_add
		* fetch_or
		* ...
	* assignment operators and member functions return the new/old value itself
		* this avoids the window between the write and a subsequent read, in which another thread could have changed the value.
	* also usable with user-defined types, but the operations are limited to load(), store(), exchange(), compare_exchange_weak(), compare_exchange_strong()
	* each operation on the atomic types has an optional memory-ordering argument.
	* operations fall into three classes
		* store -> memory_order_relaxed, memory_order_release, memory_order_seq_cst
		* load
		* read-modify-write
		* the default memory ordering is memory_order_seq_cst
	

* alternative names
	* atomic_bool
	* atomic_char
	* atomic_schar
	* atomic_uchar
	* ...
	* the naming pattern:
		* for a standard typedef or built-in type T (with signed abbreviated s, unsigned u, long long llong)
			* the atomic type is atomic_T
	* or just use std::atomic<T>

#### 5.2.2 Operations on std::atomic_flag
* it's deliberately basic and is intended as a building block only.
* rarely needed in practice
* must be initialized like this: std::atomic_flag f = ATOMIC_FLAG_INIT;
* always lock-free
* member functions
	* clear()
	* test_and_set()

```
class spinlock_mutex {
    std::atomic_flag flag;
public:
    spinlock_mutex() : flag(ATOMIC_FLAG_INIT) {}
    void lock() {
        while (flag.test_and_set(std::memory_order_acquire));
    }
    void unlock()  {
        flag.clear(std::memory_order_release);
    }
};
```


#### 5.2.3 Operations on std::atomic<bool>
* can be constructed from and assigned a bool

```
std::atomic<bool> b(true);
b = false; // this expression yields a bool, not a reference to b
```

* assignment on atomic types returns the stored value, not a reference to the variable

```
so here the parenthesized condition evaluates to false:
if (b = false) {

}
if a reference were returned, the value would have to be re-read from b,
possibly observing a value already modified again by another thread
```

* void store(new_value)
* .exchange(new_value) -&gt; 返回修改前的值
* bool load(optional_memory_order)

###### storing a new value or not depending on the current value
* bool compare_exchange_weak(T&amp; arg1, val)
* bool compare_exchange_strong()
* 若当前值=arg1,则【当前值=val】,否则【arg1=当前值】
* 若当前值被替换,则true,否则false
* compare_exchange_weak()可能 假的不等
	* 因机器缺少相应的原子操作
	* 因此可能的情况是:返回false, 且 当前值和arg1都没被替换
	* it must typically be used in a loop

```
bool expected = false;
extern atomic<bool> b;
while(!b.compare_exchange_weak(expected, true) &amp;&amp; !expected);
```

* compare_exchange_strong()
	* 返回false时,一定是因为不等

* 若不论如何都想改变当前变量(新值可能与当前值有关)
	* 计算新的值耗时,则用compare_exchange_strong()更好,避免假fail导致白白重算
	* 计算新的值不耗时,用_weak()更好。_weak()虽可能假fail,但重算成本低

* 可为success、fail分别指定memory order


#### 5.2.4 Operations on std::atomic&lt;T*&gt;
*  T* fetch_add()
	* 返回改之前的

```
Foo arr[5];
std::atomic<Foo*> p (arr);
Foo* x = p.fetch_add(2);
assert(x == arr);
```

* memory_order相关

#### 5.2.5 Operations on standard atomic integral types
* 不支持乘除
* typically used either as counters or as bitmasks
* ++x, --x : 返回T,而非atomic&lt;T&gt;

#### 5.2.6 Operations on std::atomic&lt;UDT&gt; primary class template
* UDT条件
	* must have trivial copy-assignment operator
		* no virtual functions
		* no virtual base classes 
		* uses the compiler-generated copy-assignment operator
	* every base class and non-static data member of the user-defined type must also have a trivial copy-assignment operator
	* must be bitwise equality comparable 

#### 5.2.7 Free functions for atomic operations
* 原子类型的那些成员函数,有对应的c语言风格的函数
* std::shared_ptr&lt;&gt;也有原子操作
	* _explicit操作
	* std::atomic_is_lock_free() 
	
```
std::shared_ptr<my_data> p; 
std::shared_ptr<my_data> local = std::atomic_load(&p);
std::atomic_store(&p, local);
```
### 5.3 Synchronizing operations and enforcing ordering
#### 5.3.1 The synchronizes-with relationship
* only get between operations on atomic types

#### 5.3.2 The happens-before relationship
#### 5.3.3 Memory ordering for atomic operations
* 六种
	* memory_order_relaxed
	* memory_order_consume
	* memory_order_acquire
	* memory_order_release
	* memory_order_acq_rel
	* memory_order_seq_cst &lt;- 默认

* 六种分为三类
	* 1 sequentially consistent ordering
		* memory_order_seq_cst
	* 2 acquire-release ordering
		* memory_order_consume
		* memory_order_acquire
		* memory_order_release
		* memory_order_acq_rel
	* 3 relaxed ordering
		* memory_order_relaxed  

	* 常常1'cost > 2'cost > 3'cost,但x86平台差别小

###### 1 sequentially consistent ordering
###### 2 non-sequential consistent memory orderings
###### 3 relaxed ordering
* the modification order of each var is the only thing shared between threads that are using memory_order_relaxed
###### 4 understanding relaxed ordering
* 把relaxed的原子变量比作 小屋子里拿着本子记录变量值的人

###### 5 acquire-release ordering
* load are memory_order_acquire
* store are memory_order_release
* read-modify-write are memory_order_acquire/memory_order_release/memory_order_acq_rel
* acquire、realease作为一对 互相同步
	* A release operation synchronizees-with an acquire operation that reads the value written 
* cost比sequential-consistent低得多

##### 6 data dependency with acquire-release ordering and memory_order_consume 
###### memory_order_consum与data dependency有关
* 两种关系
	* dependency-ordered-before
	* carries-a-dependency-to
* 用法
	* 1 load() with memory_order_consume
	* 2 store() with memory_order_release/memory_order_acq_rel/memory_order_seq_cst
	* 2 is dependency-ordered-before 1

* kill_dependency(arg1)  

###### 6 transitive synchronization with acquire-release ordering

#### 5.3.4 Release sequences and synchronizes-with
#### 5.3.5 Fences
* also called *memory barriers*
* fences add additional ordering constraints
* Fences are operations that enforce memory-ordering constraints without modifying any data and are typically combined with atomic operations that use memory_order_relaxed.
* fences are global operations and affect the ordering of other atomic operations in the thread that executes the fence.
* relaxed operations on separate variables can usually be freely reordered by the compiler or the hardware. Fences restrict this freedom and introduce happens-before and synchronizes-with relationships that were not present before.
* you need a release in one thread and an acquire in another to get a synchronizes-with relationship
* general ideas:
	* 若acquire operation sees【release fence之后的store()的结果】,则the release fence synchronizes-with that acquire operation
	* 若【发生在acquire fence前的load()】sees【release operation的结果】,则the release operation synchronizes-with the acquire fence
	* 上述两个结合起来:若【发生在acquire fence前的load()】sees【release fence之后的store()的结果】,则the release fence synchronizes-with the acquire fence


#### 5.3.6 Ordering nonatomic operation with atomics
* the real benefit of using atomic operations to enforce an ordering is that they can enforce an ordering on nonatomic operations and thus avoid the undefined behavior of a data race.

### 5.4 Summary

## chap6 Designing lock-based concurrent data structures
* covers:
	* what it means to design data structures for concurrency
	* Guidelines for doing so
	* Example implementations of data structures designed for concurrency.

###### 并发使用的数据结构
* use a separate mutex and external locking
* design the data structure itself for concurrent access

### 6.1 what does it mean to design for concurrency
* *thread-safe*:
	* 可多线程同时访问此数据结构,这些线程采用相同/不同的操作。
	* each thread will see a self-consistent view of the data structure
	* no data will be lost or corrupted
	* all invariants will be upheld
	* there are no problematic race conditions

* Truly designing for concurrency means: providing the opportunity for concurrency to threads accessing the data structure
* *serialization*: threads take turns accessing the data protected by the mutex; they must access it serially rather than concurrently.
* the smaller the protected region, the fewer operations are serialized, and the greater the potential for concurrency.

#### 6.1.1 Guidelines for designing data structures for concurrency
* 两方面考虑:
	* ensuring the accesses are safe
		* take care to avoid race conditions inherent in the interface to the data structure by providing functions for complete operations rather than for operation steps.
		* pay attention to exceptions
		* and so on 
		* 构造/析构函数 常不可并行访问。但需使用者确保构造完成前/析构开始后 不被访问。
	* 若支持 赋值/swap()/【copy construction】, you need to decide whether these operations are safe to call concurrently with other operations or whether they require the user to ensure exclusive access。
	* enabling genuine concurrent access 
		* can the scope of locks be restricted to allow some parts of an operation to be performed outside the lock?
		* can different parts of the data structure be protected with different mutexes?
		* do all operations require the same level of protection?
		* can a simple change to the data structure improve the opportunities for concurrency without affecting the operational semantics?

### 6.2 Lock-based concurrent data structures
#### 6.2.1 A thread-safe stack using locks
* locking a mutex may throw exception, but it's exceedingly rare.
* unlocking a mutex can not fail.
* 自己写模板类class A&lt;T&gt;时,对T有std::move(T)、new等操作f1():
	* 若T是user-defined,且T的这些操作又调用了当前这个A&lt;T&gt;的方法f2(),而f2()、f1()都用类实例的那把大锁,则可能死锁。但这可以不管、交给调用者自行确保,因为连这点都不保证的调用者压根就是在乱整。
* 调用者自行确保【构建结束前】/【析构开始后】,类实例不被调用。 

#### 6.2.2 A thread-safe queue using locks and condition variables
* 使用了std的内置容器
* by using the standard container you now have essentially one data item that is either protected or not. By taking control of the detailed implementation of the data structure, you can provide more fine-grained locking and thus allow a higher level of concurrency.

#### 6.2.3 A thread-safe queue using fine-grained locks and condition variables.
* in order to use finer-grained locking, you need to look inside the queue at its constituent parts and associate one mutex with each distinct data item.

###### 1 enabling concurrency by separating data
* 如让某个成员函数,只涉及对一个成员变量的使用。

###### 2 waiting for an item to pop

###### other
* once a bounded queue is full, push will either fail or block, until an element has been popped from the queue to make room.
* bounded queue can be useful: prevents the thread(s) populating the queue from running too far ahead of the thread(s) reading items from the queue.

### 6.3 Designing more complex lock-based data structures

#### 6.3.1 Writing a thread-safe lookup table using locks
* 若键对应的值不存在
	* 让调用者提供默认值
	* 返回&lt;T,bool&gt;
	* 返回智能指针
* boost::shared_mutex
	* 支持多线程读/一个线程写

* std::hash&lt;T&gt; 计算出size_t的哈希值
	* size_t -&gt; unsigned integral type	 
	* hash tables work best with a prime number of buckets

* hash table内含 类bucket 数组。每个bucket实例都内置锁。

#### 6.3.2 Writing a thread-safe list using locks
* iterator会引用内部数据,不好。
* 提供iteration function
	* leave it up to the user to ensure that they do not cause deadlock by acquiring locks in the user-supplied operations and do not cause data races by storing the references for access outside the locks. 

* 每个节点一把锁,虽然节点多时锁也多
	* node::mutex 
	* 释放【当前节点锁】前,锁住下一节点。
* it is undefined behavior to destroy a locked mutex。
	* 通过智能指针销毁node时,需要先unlock那个节点的锁。
		* 因为lock了前一个节点,所以【unlock后销毁下一个节点】是安全的

### 6.4 summary

## chap7 Designing lock-free concurrent data structures
* covers:
	* implementations of data structures designed for concurrency without locks.
	* techniques for managing memory in lock-free data structures
	* simple guidelines to aid in the writing of lock-free data structures.

### 7.1 Definitions and consequences
* blocking data structures and algorithms:
	* that use mutexes, condition variables, and futures to synchronize the data. 
* blocking call:
	* call function that will suspend the execution of a thread until another thread performs an action.   
* nonblocking data structures and algorithms
	* that do not use blocking call
	* not all are lock-free   

#### 7.1.1 types of nonblocking data structures
* 前文的那个用atomic_flag实现的自旋锁
	* nonblocking
	* not lock-free
	* 只能被一个线程用。多个线程用就可能data race 

#### 7.1.2 lock-free data structures
###### require:
* more than one thread must be able to access the data structure concurrently
* if one of the threads accessing the data structure is suspended by the scheduler midway through its operation, the other threads must still be able to complete their operations without waiting for the suspended thread
###### eg
* lock-free队列可同时push()、pop(), but break if two threads try to push new items at the same time.
###### other
* algorithms that use compare/exchange operations on the data structure often have loops in them
	* the reason for using a compare/exchange operation is that another thread might have modified the data in the meantime, in which case the code will need to redo part of its operation before trying the compare/exchange again.


#### 7.1.3 wait-free data structures
* a wait-free data structure is a lock-free data structure and 【can complete its operation within a bounded number of steps, regardless of the behavior of other threads】
	* can not have live lock 

#### 7.1.4 the pros and cons of lock-free data structures
* must use atomic operations for the modifications
* *live lock* occurs when two threads each try to change the data structure, but for each thread, the changes made by the other require the operation to be restarted.
	* sap performance rather than cause long-term problems 

###### downside:
* may well decrease overall performance 
	* atomic operations can be much slower than nonatomic operations. 
		* 【lock-free数据结构里的原子操作】可能比【基于锁的数据结构里的 原子操作】多
		* the cache ping-pong associated with multiple threads accessing the same atomic var can be a significant performance drain.

###### 其他
* it is important to check the performance aspects between 【lock-based data structure】和【lock-free data structure】:
	* 最差等待时间
	* 平均等待时间
	* 总执行时间
	* 其他 

### 7.2 Examples of lock-free data structures
* 这些例子,先直接用默认的memory_order_seq_cst,接着reducing constraints to other memory order.
* only std::atomic_flag is guaranteed not to use locks in the implementation. 因此一些平台上,看似没用锁而只用原子操作的代码,其实标准库的实现用了锁。
	* 在上述那些平台,直接用锁或许更合适。 

#### 7.2.1 thread-safe stack without locks
* push()和pop()都用到compare/exchange循环

```
while(!x.compare_exchange_weak(y, z));
/*
x == y 是invariant。这种关系应一直保持,不应在中间被其他线程破坏。
*/
// push():
// 在把new_node赋给head的过程中,invariant【new_node->next == head】不能被【其他线程改head】破坏
while(!head.compare_exchange_weak(new_node->next, new_node));

// pop():
// 在令head = old_head->next的过程中,head需保持【old_head == head】
while(!head.compare_exchange_weak(old_head, old_head->next));
```

* 这个lock-free但不是wait-free

#### 7.2.2 stopping those pesky leaks: managing memory in lock-free data structures
* 需要写个垃圾回收器来回收被删掉的节点。
	* 因为返回的是【shared_ptr&lt;T&gt;,是data,而不是node】,而push时node是new出来的。
* 当没线程pop()时,就真正销毁【应该被销毁的节点】
	* 用atomic counter统计【正在pop()的线程数】 

#### 7.2.3 detecting nodes that cannot be reclaimed using hazard pointers
* 在持续高并发场景下,7.2.2的垃圾回收器可能无效
* the key is to identify when no more threads are accessing a particular node. By far the easiest such mechanism to reason about is the use of hazard pointers.

###### 使用hazard pointer大法
* using this relies on the fact: it is safe to【use the value of a pointer after the object it references has been deleted】^1
	* 但【new、delete的默认实现】使^1是undefined behavior。因此需要【自己重新实现new、delete】/【检查默认实现是否可以】

###### 其他
* atomic operations are often 100 times slower than an equivalent nonatomic operation on desktop CPUs.

#### 7.2.4 detecting nodes in use with reference counting 
* 这节中,作者做了以下操作:
	* 1 直接用shared_ptr&lt;T&gt;来管理node
	* 2嫌弃1中用到的shared_ptr可能不lock-free,就自己用原子操作count来模拟shared_ptr的机制。接着说2中用到的atomic的操作可能也不是lock-free的

#### 7.2.5 applying the memory model to the lock-free stack
* 之前的原子操作都是用的默认的memory_order, 这节中弄一些更relax的原子操作


#### 7.2.6 writing a thread-safe queue without locks

### 7.3 Guidelines for writing lock-free data structures
#### 7.3.1 use std::memory_order_seq_cst for prototyping
* 先用memory_order_seq_cst写好。再看看有哪些地方可以通过更relax的memory order来优化。

#### 7.3.2 use a lock-free memory reclamation scheme

#### 7.3.3 watch out for the ABA problem
#### 7.3.4 identify busy-wait loops and help the other thread

### 7.4 summary

## chap8 designing concurrent code
* covers:
	* techniques for dividing data between threads
	* factors that affect the performance of concurrent code
	* how performance factors affect the design of data structures
	* exception safety in multithreaded code
	* scalability
	* example implementations of several parallel algorithms

### 8.1 Techniques for dividing work between threads
#### 8.1.1 dividing data between threads before processing begins
#### 8.1.2 dividing data recursively
#### 8.1.3 dividing work by type
##### 8.1.3.1 dividing work by task type to separate concerns
* 若一些线程间通讯太多,那么可能不如用一个线程

##### 8.1.3.2 dividing a sequence of tasks between threads
* 类似流水线。若需要对数据们做一系列操作,则一个线程一步,接着把处理过的数据放入队列给下一个线程。
	* 适用于输入是流,而不是所有数据的情况。

* 可以让处理更流畅。
	* 如解码视频,若一个线程【完整的解压整个帧】 ,则可能某一秒,所有线程都一帧都没解压出来,接着下一秒所有都解压出来。但若流水线这种,则可以更加流畅。

### 8.2 Factors affecting the performance of concurrent code
#### 8.2.1 How many processors
* 通过std::thread::hardware_concurrency()返回核心数。
	* 但可能os也运行了其他io密集型的应用。
	* 有些async()的实现会根据os状况更好的决定是否并发执行
	* 查看os了解有没有机制帮助应用选择合适的并发数 

#### 8.2.2 data contention and cache ping-pong
* as the number of processors increases, so does the likelihood and performance impact of another problem: that of multiple processors trying to access the same data.
* *high contention*, *low contention*
* cache ping-pong:
	* the data will be passed back and forth between the caches many times
	* if a processor stalls because it has to wait for a cache transfer, it cannot do any work in the meantime, even if there are other threads waiting that could do useful work.
* mutex在不同线程间被获取、锁住、解锁,这就造成了cache ping-pong
* 越多的线程间有sharing data、mutex,high contention就越有可能

#### 8.2.3 False sharing
* cache line
	* blocks of memory, typically 32 or 64 bytes 
	* 在多个核之间共享
	* 当核上的线程需要修改cache line 1的数据时,cache line 1 的所有权就转到那个核。这就产生了cache ping-pong
* false sharing: the cache line is shared, but none of the data is.  

#### 8.2.4 How close is your data
* 若一个线程访问的数据在内存中位置差距大,则数据分散在多个cache line中,不如在更少的cache line时性能好

#### 8.2.5 oversubscription and excessive task switching
* oversubscription: 开的线程太多了,导致反而总体性能下降
*  Having the extra threads enables the application to perform useful work rather than having processors sitting idle while the threads wait.	

### 8.3 Designing data structures for multithreaded performance
#### 8.3.1 dividing array elements for complex operations
#### 8.3.2 data access patterns in other data structures
### 8.4 Additional considerations when designing for concurrency
### 8.5 Designing concurrent code in practice
### 8.6 summary

## chap9 advanced thread management
* covers:
	* thread pools
	* handling dependencies between pool tasks
	* work stealing for pool threads
	* interrupting threads 
  
### 9.1 Thread pools
#### 9.1.1 the simplest possible thread pool
* 有任务队列
* 固定数量的线程
* 若提交任务的线程需要等待任务完成,则需自行同步。

```
class thread_pool {
    std::atomic_bool done;
    thread_safe_queue<std::function<void()>> work_queue;
    std::vector<std::thread> threads;
    join_threads joiner;
    void worker_thread() {
        while (!done) {
            std::function<void()> task;
            if (work_queue.try_pop(task)) {
                task();
            } else {
                std::this_thread::yield();
            }
        }
    }
public:
    thread_pool () : done(false), joiner(threads) {
        unsigned const thread_count = std::thread::hardware_concurrency();
        try {
            for (unsigned i=0; i<thread_count; ++i) {
                threads.push_back(
                        std::thread(&thread_pool::worker_thread, this));
            }
        } catch (...) {
            done = true;
            throw;
        }
    }
    ~thread_pool() {
        done = true;
    }
    template<typename FunctionType>
    void submit(FunctionType f) {
        work_queue.push(std::function<void()>(f));
    }
};
```

## 附录A
### A1 rvalue reference
* 仅可绑定到rvalue

```
// 声明
int && i = 42;
// i是rvalue reference,但i本身是lvalue;42是rvalue
void f1(int && p) { p = 1; }
f1(move(i)); // => i变成1
// f1(i)会编译报错,因为f1的参数是rvalue reference,而i本身是lvalue
```

#### A1.1 move
* 用于 "steal" rvalue。被steal的可能变成默认构造器造出来的那种

```
// 这里vec是 rvalue reference.
// 传入参数本身是rvalue时,就不会copy
void f(vector<int> &&vec);

```

* 强制steal
	* static_cast<T&&>
	* std::move()

```
X x1;
X x2 = std::move(x1);
X x3 = static_cast<X&&>(x2);
```

* unique_ptr不可复制(复制会编译报错),需用move

```
unique_ptr<int> p = make_unique<int>(1);
// 编译会出错:
// x vs.push_back(p); x
// 正解:
// vs.push_back(move(p));
// move之后p.get()=0,不可*p

// 
shared_ptr<int> p1 = make_shared<int>(1);
// use_count ++ :
shared_ptr<int> p2 = p1;  
shared_ptr<int> p2 (p1);
f1(p1); // f1(shared_ptr<int>)
vs.push_back(p2); // shared_ptr可复制

// use_count不变 :
shared_ptr<int> p2 = move(p1); // => p1.get() = 0;

// 未初始化的shared_ptr p:
// use_count = 0; 等于nullptr; p.get()等于0

// 未初始化的unique_ptr p:
// 等于nullptr; p.get()等于0
```

* thread、unique_lock、future<>、promise<>、packaged_task<> 都不可复制,但都可move

#### A1.2 rvalue references and function templates

### A2 deleted functions
* 让类不被copy的方法

```
// 1. C++11之前的hack:
// 把copy constructor、copy assignment operator声明成private且不实现,
// 别人复制时就会报错(无法访问/没实现)
// 2. C++11: 直接delete
A(A const &) = delete;
A& operator= (A const &) = delete;
```

* 让类不能被copy,让其能被move

```
// 假设A有可move的成员a
A(A&& other) : a(move(other.a)) {}
A& operator=(A&& other) {
    a = move(other.a);
    return *this;
}
```

### A3 defaulted functions
* 声明函数,当编译器可以自动生成某函数的实现时,让编译器帮你写。
	* 没声明函数,而编译器自动生成此函数的 声明和实现时,编译器会让其public。通过这波操作,可以让protected/private
	* 没声明函数,而编译器会自动生成此函数的 声明和实现时。通过这波操作,读代码的人知道有这个函数。 
	* 没声明函数,而编译器不会自动生成此函数的 声明和实现时。如 默认构造器=defaulted,

```
class A {
    private:
        // change access
        A() = default;
    public:
        // take a non-const reference
        A(A&) = default;
        // declare as defaulted for documentation
        A& operator=(const A&) = default;
    protected:
        // change access and add virtual
        virtual ~A() = default;
};
```

* 一些函数不为default,则这些函数必定不trivial。trivial的一些好处
	* Object若有default的 trivial copy constructor, trivial copy assignment operator, trivial destructor,可以memcpy、memmove
	* Classes with trivial copy assignment 可用于atomic<>,以便 原子操作

* constructor都不是 自己写的,则这个class can be initialized with an aggregate initializer

```
struct A {
    A() = default;
    A(A const &) = default;
    int a; 
    double b;
};
A x = {1,2.11};
```

#### A.4 constexpr functions
##### A.4.1 constexpr and user-defined types
* 是literal type的class,需要满足的条件:
	* 拥有trivial copy constructor, 拥有 trivial destructor
	* all non-static data members and base classes must be trivial types
	* must have either a trivial default constructor or a constexpr constructor other than the copy constructor.

* constexpr函数 只能 call 其他 constexpr 函数
* constexpr函数 也可像普通函数一样用于非constant expression,但只有参数都是constant expression时,其结果才是constant expression

##### A.4.2 constexpr objects

##### A.4.3 constexpr function requirements
* 除开构造函数的 要求:
	* 参数 都是 literal type
	* 返回类型 是 literal type
	* 函数体 只有一个 return 
	* return语句是constant expression
	* 用于构造 返回值 的 任何 构造器、转换操作符 都需是constexpr

* constexpr的 成员函数 的额外要求:
	* 不能为virtual
	* 所属类 需是 literal type 

* constexpr的构造函数 的要求:
	* 构造函数体 为空
	* 基类 需初始化
	* non-static成员 需初始化
	* any constructor used in the member initialization list must qualify as constant expressions.

* trivial copy constructors are implicitly constexpr

##### A.4.4 constexpr and templates

#### A.5 Lambda functions
* 最简形式

```
// 无参数,无返回值,仅使用全局变量、函数
[] {
    do_stuff();
    do_more_stuff();
}();
```

* 例子

```
vector<int> data = make_data();
for_each(data.begin(),data.end(), [](int i) { cout<<i<<endl; });
```

* for_each(Iterator start, Iterator end, arg3)
	* arg3是 callable的(函数/实现()的类)
	* arg3以 遍历的元素 作为输入

```
vector<int> vs;
void f(int i) {
    cout<<i<<endl;
}
class A {
public:
    void operator()(int i) {}
};
for_each(vs.begin(),vs.end(), f);

A a;
for_each(vs.begin(),vs.end(), a);
```

* lambda 返回值类型
	* 可由lambda表达式中的 单一return语句推断。
	* 显式指定。[](args...) -> 返回值类型 {}

```
cond.wait(lk,[]()->bool{return myb;});
```

##### A.5.1 lambda functions that reference local variablers

* [=] // 捕获外部 所有局部变量 的 copy
* [&] // 捕获外部 所有局部变量 的 reference
* 捕获全部,默认copy,一些reference

```
// j、k为reference
int i=1,j=2,k=3;
[=, &j, &k]{return i+j+k;}();
```

* 捕获全部,默认reference,一些copy

```
[&,j,k]{return i+j+k;}();
```

* 捕获部分

```
[&i, j, &k]{return i+j+k;}();
// 成员函数中的lambda
class A {
    int x;
    void foo(vector<int>& vs) {
        for_each(vs.begin(),vs.end(), [this](int & i){
            i += x;
        });
    }
};
```

* concurrency中的使用场景
	* std::condition_variable::wait()
	* std::packaged_task<>
	* thread pools for packaging small tasks
	* std::thread constructor
	* as the function when using parallel algorithms 
		* parallel_for_each() 

#### A.6 可变参数模板
#### A.7 自动判断变量类型
* if a variable is initialized in its declaration from a value of the same type, then you can specify the type as auto
* 知识
	* array types decay to pointers, references(引用) are dropped unless the type expression explicitly declares the variable as a reference 

```
int i = 0;
int & r = i;
auto x = r; // int
auto& y = r; // int &
```

#### A.8 Thread-local variables