MINOR: Fixed broken links in the documentation

Author: Vahid Hashemian <vahidhashemian@us.ibm.com>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #2010 from vahidhashemian/doc/fix_hyperlinks
Vahid Hashemian 2016-10-11 20:25:35 -07:00 committed by Jason Gustafson
parent 93b9400163
commit ae9532c6b3
3 changed files with 13 additions and 7 deletions


@@ -20,7 +20,7 @@ Kafka includes four core apis:
<li>The <a href="#producerapi">Producer</a> API allows applications to send streams of data to topics in the Kafka cluster.
<li>The <a href="#consumerapi">Consumer</a> API allows applications to read streams of data from topics in the Kafka cluster.
<li>The <a href="#streamsapi">Streams</a> API allows transforming streams of data from input topics to output topics.
<li>The <a href="#producerapi">Connect</a> API allows implementing connectors that continually pull from some source system or application into Kafka or push from Kafka into some sink system or application.
<li>The <a href="#connectapi">Connect</a> API allows implementing connectors that continually pull from some source system or application into Kafka or push from Kafka into some sink system or application.
</ol>
Kafka exposes all its functionality over a language-independent protocol which has clients available in many programming languages. However, only the Java clients are maintained as part of the main Kafka project; the others are available as independent open source projects. A list of non-Java clients is available <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">here</a>.
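As a quick illustration of the Producer API in the list above, a minimal sketch of sending one record with the Java client could look like the following; the class name, broker address, topic, key, and value are all placeholders, not part of any documented example:
<pre>
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Send one record to "my-topic" and release the client's resources.
        Producer&lt;String, String&gt; producer = new KafkaProducer&lt;&gt;(props);
        producer.send(new ProducerRecord&lt;&gt;("my-topic", "key", "value"));
        producer.close();
    }
}
</pre>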
@@ -58,7 +58,7 @@ To use the consumer, you can use the following maven dependency:
&lt;/dependency&gt;
</pre>
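For a matching consumer sketch (again, the class name, topic, and group id are illustrative placeholders), a basic poll loop could look like this:
<pre>
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer&lt;String, String&gt; consumer = new KafkaConsumer&lt;&gt;(props);
        consumer.subscribe(Arrays.asList("my-topic"));
        // Poll in a loop; each call returns whatever records arrived since the last call.
        while (true) {
            ConsumerRecords&lt;String, String&gt; records = consumer.poll(100);
            for (ConsumerRecord&lt;String, String&gt; record : records)
                System.out.printf("offset = %d, key = %s, value = %s%n",
                        record.offset(), record.key(), record.value());
        }
    }
}
</pre>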
<h3><a id="streamsapi" href="#streamsapi">Streams API</a></h3>
<h3><a id="streamsapi" href="#streamsapi">2.3 Streams API</a></h3>
The <a href="#streamsapi">Streams</a> API allows transforming streams of data from input topics to output topics.
<p>
@@ -77,7 +77,7 @@ To use Kafka Streams you can use the following maven dependency:
&lt;/dependency&gt;
</pre>
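With that dependency on the classpath, one minimal Streams topology sketch, assumed here purely for illustration, copies records from an input topic to an output topic while uppercasing values; the application id and topic names are placeholders:
<pre>
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class StreamsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        // Read from the input topic, transform each value, and write to the output topic.
        KStreamBuilder builder = new KStreamBuilder();
        KStream&lt;String, String&gt; source = builder.stream("my-input-topic");
        source.mapValues(value -&gt; value.toUpperCase()).to("my-output-topic");

        new KafkaStreams(builder, props).start();
    }
}
</pre>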
<h3><a id="connectapi" href="#connectapi">Connect API</a></h3>
<h3><a id="connectapi" href="#connectapi">2.4 Connect API</a></h3>
The Connect API allows implementing connectors that continually pull from some source data system into Kafka or push from Kafka into some sink data system.
<p>
@@ -86,7 +86,7 @@ Many users of Connect won't need to use this API directly, though, they can use
Those who want to implement custom connectors can see the <a href="/0100/javadoc/index.html?org/apache/kafka/connect" title="Kafka 0.10.0 Javadoc">javadoc</a>.
<p>
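To give a rough feel for that javadoc, a bare-bones custom source connector might be skeletoned as below; the class names are hypothetical and the task returns no records, so a real implementation would replace the poll() body:
<pre>
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

public class SketchSourceConnector extends SourceConnector {
    private Map&lt;String, String&gt; config;

    @Override public String version() { return "0.1"; }
    @Override public void start(Map&lt;String, String&gt; props) { config = props; }
    @Override public Class&lt;? extends Task&gt; taskClass() { return SketchSourceTask.class; }
    @Override public List&lt;Map&lt;String, String&gt;&gt; taskConfigs(int maxTasks) {
        // Every task gets the same configuration in this sketch.
        return Collections.nCopies(maxTasks, config);
    }
    @Override public void stop() { }
    @Override public ConfigDef config() { return new ConfigDef(); }

    public static class SketchSourceTask extends SourceTask {
        @Override public String version() { return "0.1"; }
        @Override public void start(Map&lt;String, String&gt; props) { }
        @Override public List&lt;SourceRecord&gt; poll() throws InterruptedException {
            // A real task would read from the source system and return records here.
            Thread.sleep(1000);
            return Collections.emptyList();
        }
        @Override public void stop() { }
    }
}
</pre>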
<h3><a id="legacyapis" href="#streamsapi">Legacy APIs</a></h3>
<h3><a id="legacyapis" href="#streamsapi">2.5 Legacy APIs</a></h3>
<p>
A more limited legacy producer and consumer API is also included in Kafka. These old Scala APIs are deprecated and still available only for compatibility purposes. Information on them can be found here <a href="/081/documentation.html#producerapi" title="Kafka 0.8.1 Docs">


@@ -327,8 +327,12 @@ makes a log more complete, ensuring log consistency during leader failure or cha
<p>
This majority vote approach has a very nice property: the latency is dependent on only the fastest servers. That is, if the replication factor is three, the latency is determined by the faster slave, not the slower one.
<p>
- There are a rich variety of algorithms in this family including ZooKeeper's <a href="http://www.stanford.edu/class/cs347/reading/zab.pdf">Zab</a>, <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf">Raft</a>,
- and <a href="http://pmg.csail.mit.edu/papers/vr-revisited.pdf">Viewstamped Replication</a>. The most similar academic publication we are aware of to Kafka's actual implementation is <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=66814">PacificA</a> from Microsoft.
+ There are a rich variety of algorithms in this family including ZooKeeper's
+ <a href="http://web.archive.org/web/20140602093727/http://www.stanford.edu/class/cs347/reading/zab.pdf">Zab</a>,
+ <a href="https://ramcloud.stanford.edu/wiki/download/attachments/11370504/raft.pdf">Raft</a>,
+ and <a href="http://pmg.csail.mit.edu/papers/vr-revisited.pdf">Viewstamped Replication</a>.
+ The most similar academic publication we are aware of to Kafka's actual implementation is
+ <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=66814">PacificA</a> from Microsoft.
<p>
The downside of majority vote is that it doesn't take many failures to leave you with no electable leaders. To tolerate one failure requires three copies of the data, and to tolerate two failures requires five copies
of the data. In our experience, having only enough redundancy to tolerate a single failure is not enough for a practical system, but doing every write five times, with 5x the disk space requirements and 1/5th the
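Spelling out that arithmetic in a throwaway sketch (replicasNeeded is just an illustrative name): tolerating f failures under majority vote needs 2f + 1 copies, because a majority of f + 1 must survive and overlap every prior majority.
<pre>
public class QuorumMath {
    // Majority vote: tolerating f failures requires 2f + 1 copies of the data.
    static int replicasNeeded(int failures) { return 2 * failures + 1; }

    public static void main(String[] args) {
        System.out.println(replicasNeeded(1)); // 3 copies to tolerate one failure
        System.out.println(replicasNeeded(2)); // 5 copies to tolerate two failures
    }
}
</pre>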


@@ -566,7 +566,9 @@ In general you don't need to do any low-level tuning of the filesystem, but in t
In Linux, data written to the filesystem is maintained in <a href="http://en.wikipedia.org/wiki/Page_cache">pagecache</a> until it must be written out to disk (due to an application-level fsync or the OS's own flush policy). The flushing of data is done by a set of background threads called pdflush (or, in post-2.6.32 kernels, "flusher threads").
<p>
- Pdflush has a configurable policy that controls how much dirty data can be maintained in cache and for how long before it must be written back to disk. This policy is described <a href="http://www.westnet.com/~gsmith/content/linux-pdflush.htm">here</a>. When Pdflush cannot keep up with the rate of data being written, it will eventually cause the writing process to block, incurring latency in the writes to slow down the accumulation of data.
+ Pdflush has a configurable policy that controls how much dirty data can be maintained in cache and for how long before it must be written back to disk.
+ This policy is described <a href="http://web.archive.org/web/20160518040713/http://www.westnet.com/~gsmith/content/linux-pdflush.htm">here</a>.
+ When Pdflush cannot keep up with the rate of data being written, it will eventually cause the writing process to block, incurring latency in the writes to slow down the accumulation of data.
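One way to inspect those policy knobs on a Linux box is to read the vm sysctls directly; the paths below are standard /proc locations, and the class name is illustrative:
<pre>
import java.nio.file.Files;
import java.nio.file.Paths;

public class FlushSettings {
    public static void main(String[] args) throws Exception {
        // Sysctls controlling how much dirty data may accumulate, and for how long,
        // before the flusher threads (or the writing process itself) must write it out.
        String[] knobs = {
            "/proc/sys/vm/dirty_background_ratio", // % of memory dirty before background writeback starts
            "/proc/sys/vm/dirty_ratio",            // % of memory dirty before writers are blocked
            "/proc/sys/vm/dirty_expire_centisecs"  // age after which dirty data must be flushed
        };
        for (String knob : knobs)
            System.out.println(knob + " = " + new String(Files.readAllBytes(Paths.get(knob))).trim());
    }
}
</pre>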
<p>
You can see the current state of OS memory usage by doing
<pre>