Compatibility
Can I drop in AsyncHBase in place of the HTable client?
Unfortunately, no. The APIs are fairly different. For example, every call in
this client is asynchronous, so you either have to wait for the response or use
a callback chain to process it.
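To illustrate the two options, here is a minimal sketch that uses only the JDK's CompletableFuture as a stand-in. It is not the AsyncHBase API: AsyncHBase calls return their own Deferred type rather than a CompletableFuture, and the `fakeGet` method and the `AsyncCallSketch` class below are hypothetical names invented for this example. The shape of the code (block on the result, or chain callbacks onto it) is the part that carries over.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

public class AsyncCallSketch {
    // Hypothetical stand-in for an asynchronous read: it returns right away
    // with a future that is completed later on another thread.
    public static CompletableFuture<byte[]> fakeGet(String rowKey) {
        return CompletableFuture.supplyAsync(
            () -> ("value-for-" + rowKey).getBytes(StandardCharsets.UTF_8));
    }

    // Option 1: block the calling thread until the response arrives.
    public static String getAndWait(String rowKey) {
        byte[] value = fakeGet(rowKey).join();
        return new String(value, StandardCharsets.UTF_8);
    }

    // Option 2: chain callbacks that run once the response arrives.
    // The caller gets a future back and is never blocked here.
    public static CompletableFuture<Integer> getThenMeasure(String rowKey) {
        return fakeGet(rowKey)
            .thenApply(bytes -> new String(bytes, StandardCharsets.UTF_8))
            .thenApply(String::length);
    }

    public static void main(String[] args) {
        System.out.println(getAndWait("row1"));
        System.out.println(getThenMeasure("row1").join());
    }
}
```

Blocking is simpler to read but ties up a thread per outstanding call. The callback chain keeps threads free, which is the model this client is built around.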
Will this work with version X of HBase?
Probably. One of the nicest features of AsyncHBase is that you don't have to
upgrade the JAR every time you upgrade HBase. The client is compatible with
HBase from versions 0.92 to 0.94 and 0.96 to 1.3. Be sure to test the client
with HBase before upgrading in case something does change with the HBase protocols.
Can I use this with a secured HBase cluster?
As of version 1.7 you can connect to a cluster protected with Kerberos or simple
passwords.
Will AsyncHBase work with HBase from vendor X?
Again, probably. We've tested it with the vanilla HBase from Apache as well as
various versions of Cloudera and Hortonworks.
Can I connect to a cloud-hosted HBase cluster?
Probably, as long as the hosted HBase isn't heavily customized with proprietary
security or other changes.
NOTE: Google's hosted Bigtable offering has an HBase-compatible API layer
for porting applications. Under the hood, however, the Bigtable client is quite
different from the HBase RPC implementation. Therefore AsyncHBase cannot be used
to write to Bigtable. Instead, use the native asynchronous APIs of the Bigtable
client.
What are some of the major differences between HTable and AsyncHBase?
- Only one instance of the HBaseClient is needed per HBase cluster.
- RPCs have "Request" appended to their name, e.g.
"Get" is now "GetRequest".
- Mutations like the PutRequest or AppendRequest are batched and
periodically flushed by default. That means if the JVM crashes, some
mutations may not have been committed to HBase.
- At this time AsyncHBase does not support co-processors.
- Some filters may not be supported (yet, help us add them!)
- Batch mutations are not directly supported yet.
- A number of exceptions thrown by the client are different.
- AsyncHBase has finer-grained control over RPC behavior.
- ... um, AsyncHBase is asynchronous? (Though HTable has grown increasingly
asynchronous over the years and may eventually obviate the need for this client.)
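The point in the list above about batched mutations is easier to see with a small stand-in. This sketch is not AsyncHBase code: the `BufferedMutationSketch` class and its methods are hypothetical, and the "committed" list merely plays the role of data stored in HBase. It assumes a client that queues mutations in memory and only sends them when a flush happens, so anything still queued when the JVM dies is lost.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferedMutationSketch {
    // Mutations waiting in client memory; lost if the JVM crashes.
    private final List<String> buffer = new ArrayList<>();
    // Stand-in for mutations that have reached the server.
    private final List<String> committed = new ArrayList<>();

    // Queue a mutation. Nothing is sent yet.
    public synchronized void put(String mutation) {
        buffer.add(mutation);
    }

    // Send everything queued so far and return how many were sent.
    // A real client would trigger this from a periodic timer.
    public synchronized int flush() {
        int sent = buffer.size();
        committed.addAll(buffer);
        buffer.clear();
        return sent;
    }

    public synchronized int pendingCount() {
        return buffer.size();
    }

    public synchronized int committedCount() {
        return committed.size();
    }

    public static void main(String[] args) {
        BufferedMutationSketch client = new BufferedMutationSketch();
        client.put("row1:cf:q=1");
        client.put("row2:cf:q=2");
        // Both mutations are still pending; a crash here would lose them.
        System.out.println("pending=" + client.pendingCount()
            + " committed=" + client.committedCount());
        client.flush();
        System.out.println("pending=" + client.pendingCount()
            + " committed=" + client.committedCount());
    }
}
```

In AsyncHBase itself, HBaseClient offers flush() and setFlushInterval() to control this window. Check the Javadoc for your version before relying on exact behavior.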
Meta
Why write AsyncHBase?
The HTable client included with HBase follows a synchronous model that makes
using it fairly straightforward but, in doing so, makes inefficient use of
threads. While it's slowly including more asynchronous methods, the API
doesn't allow quite as fine-grained control over RPC behavior as AsyncHBase.
With fewer threads and an event-driven model, AsyncHBase can write and
read faster while using fewer resources.
How to contribute to AsyncHBase?
The easiest way is to fork the project on GitHub. Make whatever changes you
want to your own fork, then send a pull request. You can also send your patches
to the mailing list.
Be prepared to go through a couple of iterations while the code is reviewed
before it is accepted into the main repository. If you are familiar with
how the Linux kernel is developed, then this is pretty similar.
Who commits to AsyncHBase?
Anyone can commit to AsyncHBase, provided that the changes are accepted
in the main repository after getting reviewed. If you have a few substantial
PRs that have been merged and would like to become a committer, just let us know.
Why does AsyncHBase use the LGPL?
One of the most frequent "holy wars" that plague open-source communities is
that of which licenses to use and which ones are better or "more free" than others.
OpenTSDB uses the GNU LGPLv2.1+
for maximum compatibility with its dependencies and other licenses, and
because its author thinks that the LGPL strikes the right balance between the
goals of free software and the legal restrictions often present in corporate
environments.
Let's stress the following points:
- The LGPL is not the GPL. Although based on the same text, the
way it extends the GPL has significant consequences. Do not confuse the two.
- The LGPL is perfectly compatible with Java. The myth that the LGPL does not
work as intended with Java is, well, just a myth, albeit a widespread one.
- The LGPL allows you to use the code in proprietary software, provided that
you don't redistribute a modified version of the LGPL'ed code.
- If you want to redistribute a modified version of the code, then your
changes must be released under the LGPL.
- The LGPL is perfectly compatible with the ASF2 license.
Many people are misled into believing that there is an incompatibility because
the Apache Software Foundation (ASF) decided not to allow inclusion of LGPL'ed
code in its own projects. This choice only applies to the projects managed by
the ASF itself and doesn't stem from any license incompatibility.
With this out of the way, we hope that those afraid of the 3 letters "GPL"
will acknowledge the importance of using the LGPL in OpenTSDB and will
overcome their fear of the license.
Disclaimer: This page doesn't provide any formal legal advice.
Information given here is given in good faith. If you have any doubt, talk to
a lawyer first. In the text above, "LGPL" or "GPL" refers to version 2.1 of
the license, or (at your option) any later version.
Who supports AsyncHBase?
StumbleUpon supported the initial development of AsyncHBase as well as its
open-source release.
Yahoo is a heavy user of and contributor to AsyncHBase.
YourKit is kindly supporting open source projects with its full-featured Java
Profiler. YourKit, LLC is the creator of innovative and intelligent tools for
profiling Java and .NET applications. Take a look at YourKit's leading
software products: YourKit Java Profiler and YourKit .NET Profiler.