Hive error (with Derby): Source tables cannot be empty

I'm just getting started with Hive and need your help with an error. It happens after a timeout when I try to create a new database:

 hive> CREATE DATABASE Test;

(edit: I get the same thing with "SHOW TABLES")

I get this:

Exception in thread "main" java.lang.AssertionError: Source tables cannot be empty
    at org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyTables.<clinit>(EnforceReadOnlyTables.java:46)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hadoop.hive.ql.hooks.HookUtils.getHooks(HookUtils.java:60)
    at org.apache.hadoop.hive.ql.Driver.getHooks(Driver.java:1612)
    at org.apache.hadoop.hive.ql.Driver.getHooks(Driver.java:1596)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1677)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)

Here is my hive-site.xml:

    <configuration>

<property>
  <name>hive.in.test</name>
    <value>true</value>
      <description>Internal marker for test. Used for masking env-dependent values</description>
      </property>

<!-- Hive Configuration can either be stored in this file or in the hadoop configuration files  -->
<!-- that are implied by Hadoop setup variables.                                                -->
<!-- Aside from Hadoop setup variables - this file is provided as a convenience so that Hive    -->
<!-- users do not have to edit hadoop configuration files (that may be managed as a centralized -->
<!-- resource).                                                                                 -->

<!-- Hive Execution Parameters -->
<property>
  <name>hadoop.tmp.dir</name>
    <value>${test.tmp.dir}/hadoop-tmp</value>
      <description>A base for other temporary directories.</description>
      </property>

<!--
     <property>
       <name>hive.exec.reducers.max</name>
         <value>1</value>
           <description>maximum number of reducers</description>
           </property>
           -->

<property>
  <name>hive.exec.scratchdir</name>
    <value>${test.tmp.dir}/scratchdir</value>
      <description>Scratch space for Hive jobs</description>
      </property>

<property>
  <name>hive.exec.local.scratchdir</name>
    <value>${test.tmp.dir}/localscratchdir/</value>
      <description>Local scratch space for Hive jobs</description>
      </property>

<property>
  <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
    </property>

<property>
  <name>hive.metastore.schema.verification</name>
    <value>false</value>
    </property>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
    </property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.ClientDriver</value>
    </property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
    <value>APP</value>
    </property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
    <value>mine</value>
    </property>

<property>
  <!--  this should eventually be deprecated since the metastore should supply this -->
  <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
      <description></description>
      </property>

<property>
  <name>hive.metastore.metadb.dir</name>
    <value>file://${test.tmp.dir}/metadb/</value>
      <description>
        Required by metastore server or if the uris argument below is not supplied
          </description>
          </property>

<property>
  <name>test.log.dir</name>
    <value>${test.tmp.dir}/log/</value>
      <description></description>
      </property>

<property>
  <name>test.data.files</name>
    <value>${hive.root}/data/files</value>
      <description></description>
      </property>

<property>
  <name>test.data.scripts</name>
    <value>${hive.root}/data/scripts</value>
      <description></description>
      </property>

<property>
  <name>hive.jar.path</name>
    <value>${maven.local.repository}/org/apache/hive/hive-exec/${hive.version}/hive-exec-${hive.version}.jar</value>
      <description></description>
      </property>

<property>
  <name>hive.metastore.rawstore.impl</name>
    <value>org.apache.hadoop.hive.metastore.ObjectStore</value>
      <description>Name of the class that implements org.apache.hadoop.hive.metastore.rawstore interface. This class is used to store and retrieval of raw metadata objects such as table, database</description>
      </property>

<property>
  <name>hive.querylog.location</name>
    <value>${test.tmp.dir}/tmp</value>
      <description>Location of the structured hive logs</description>
      </property>

<property>
  <name>hive.exec.pre.hooks</name>
    <value>org.apache.hadoop.hive.ql.hooks.PreExecutePrinter, org.apache.hadoop.hive.ql.hooks.EnforceReadOnlyTables</value>
      <description>Pre Execute Hook for Tests</description>
      </property>

<property>
  <name>hive.exec.post.hooks</name>
    <value>org.apache.hadoop.hive.ql.hooks.PostExecutePrinter</value>
      <description>Post Execute Hook for Tests</description>
      </property>

<property>
  <name>hive.support.concurrency</name>
    <value>true</value>
      <description>Whether hive supports concurrency or not. A zookeeper instance must be up and running for the default hive lock manager to support read-write locks.</description>
      </property>

<property>
  <key>hive.unlock.numretries</key>
    <value>2</value>
      <description>The number of times you want to retry to do one unlock</description>
      </property>

<property>
  <key>hive.lock.sleep.between.retries</key>
    <value>2</value>
      <description>The sleep time (in seconds) between various retries</description>
      </property>


<property>
  <name>fs.pfile.impl</name>
    <value>org.apache.hadoop.fs.ProxyLocalFileSystem</value>
      <description>A proxy for local file system used for cross file system testing</description>
      </property>

<property>
  <name>hive.exec.mode.local.auto</name>
    <value>false</value>
      <description>
          Let hive determine whether to run in local mode automatically
              Disabling this for tests so that minimr is not affected
                </description>
                </property>

<property>
  <name>hive.auto.convert.join</name>
    <value>false</value>
      <description>Whether Hive enable the optimization about converting common join into mapjoin based on the input file size</description>
      </property>

<property>
  <name>hive.ignore.mapjoin.hint</name>
    <value>false</value>
      <description>Whether Hive ignores the mapjoin hint</description>
      </property>

<property>
  <name>hive.input.format</name>
    <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
      <description>The default input format, if it is not specified, the system assigns it. It is set to HiveInputFormat for hadoop versions 17, 18 and 19, whereas it is set to CombineHiveInputFormat for hadoop 20. The user can always overwrite it - if there is a bug in CombineHiveInputFormat, it can always be manually set to HiveInputFormat. </description>
      </property>

<property>
  <name>hive.default.rcfile.serde</name>
    <value>org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe</value>
      <description>The default SerDe hive will use for the rcfile format</description>
      </property>

<property>
  <name>hive.stats.key.prefix.reserve.length</name>
    <value>0</value>
    </property>

<property>
  <name>hive.conf.restricted.list</name>
    <value>dummy.config.value</value>
      <description>Using dummy config value above because you cannot override config with empty value</description>
      </property>

<property>
  <name>hive.exec.submit.local.task.via.child</name>
    <value>false</value>
    </property>


<property>
  <name>hive.dummyparam.test.server.specific.config.override</name>
    <value>from.hive-site.xml</value>
      <description>Using dummy param to test server specific configuration</description>
      </property>

<property>
  <name>hive.dummyparam.test.server.specific.config.hivesite</name>
    <value>from.hive-site.xml</value>
      <description>Using dummy param to test server specific configuration</description>
      </property>

<property>
  <name>test.var.hiveconf.property</name>
    <value>${hive.exec.default.partition.name}</value>
      <description>Test hiveconf property substitution</description>
      </property>

<property>
  <name>test.property1</name>
    <value>value1</value>
      <description>Test property defined in hive-site.xml only</description>
      </property>

<property>
  <name>hive.test.dummystats.aggregator</name>
    <value>value2</value>
    </property>

<property>
  <name>hive.fetch.task.conversion</name>
    <value>minimal</value>
    </property>

<property>
  <name>hive.users.in.admin.role</name>
    <value>hive_admin_user</value>
    </property>

<property>
  <name>hive.security.authorization.manager</name>
    <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest</value>
      <description>The Hive client authorization manager class name.</description>
      </property>

<property>
  <name>hive.llap.io.cache.orc.size</name>
    <value>8388608</value>
    </property>

<property>
  <name>hive.llap.io.cache.orc.arena.size</name>
    <value>8388608</value>
    </property>

<property>
  <name>hive.llap.io.cache.orc.alloc.max</name>
    <value>2097152</value>
    </property>


<property>
  <name>hive.llap.io.cache.orc.alloc.min</name>
    <value>32768</value>
    </property>

<property>
  <name>hive.llap.cache.allow.synthetic.fileid</name>
    <value>true</value>
    </property>

<property>
  <name>hive.llap.io.use.lrfu</name>
    <value>true</value>
    </property>


<property>
  <name>hive.llap.io.allocator.direct</name>
    <value>false</value>
    </property>


<property>
  <name>hive.materializedview.rewriting</name>
    <value>true</value>
    </property>


</configuration>

I more or less followed these steps: https://cwiki.apache.org/confluence/display/Hive/HiveDerbyServerMode. What am I missing? Thanks!


person Flibidi    schedule 18.05.2017
comment
Can you remove the property below from hive-site.xml? -- <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:derby://localhost:1527/metastore_db;create=true</value> </property>   -  person SachinJ    schedule 18.05.2017
comment
If I do that, I get the following: Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient   -  person Flibidi    schedule 19.05.2017
comment
You cannot use more than one instance when Derby is configured as the Hive metastore. You may need to delete the lock file if the previous session did not shut down properly. Please refer to my post here: stackoverflow.com/questions/22711364/   -  person SachinJ    schedule 21.05.2017


Answers (1)


I finally found the solution:

I had to run the SQL script that creates the metastore schema: $HIVE_HOME/scripts/metastore/upgrade/derby/hive-schema-x.x.x.derby.sql. I used ij to do that.

More on this: https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool http://db.apache.org/derby/papers/DerbyTut/ij_intro.html#Run+SQL+Scripts
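
For anyone who wants to reproduce the ij step, here is a rough sketch of what it can look like. It is not necessarily the exact sequence I typed. It assumes that HIVE_HOME is set, that the Derby network server is already listening on localhost:1527 (the URL from the hive-site.xml above), and that ij is on the PATH. SCHEMA_VERSION and the make_ij_script helper are names made up for this example, and the x.x.x placeholder has to be replaced with the version that matches your Hive release. If ij or the schema file is not found, the script only prints the two ij commands so you can paste them into an interactive ij session.

```shell
#!/bin/sh
# Sketch: load the Hive metastore schema into a Derby network server with ij.
# Assumptions (not from the original post): HIVE_HOME is set, Derby listens
# on localhost:1527, and ij is on the PATH. SCHEMA_VERSION is a hypothetical
# variable; replace x.x.x with the version shipped with your Hive release.

SCHEMA_VERSION="${SCHEMA_VERSION:-x.x.x}"
SCHEMA_DIR="${HIVE_HOME:-/path/to/hive}/scripts/metastore/upgrade/derby"
SCHEMA_SQL="hive-schema-${SCHEMA_VERSION}.derby.sql"
JDBC_URL="jdbc:derby://localhost:1527/metastore_db;create=true"

# Print the two ij commands: connect to the metastore database, then run
# the schema script.
make_ij_script() {
  printf "connect '%s';\n" "$JDBC_URL"
  printf "run '%s';\n" "$SCHEMA_SQL"
}

if command -v ij >/dev/null 2>&1 && [ -f "$SCHEMA_DIR/$SCHEMA_SQL" ]; then
  # Run from the script's directory in case the schema script pulls in
  # sibling scripts through relative paths.
  (cd "$SCHEMA_DIR" && make_ij_script | ij)
else
  # ij or the schema file is missing: just show the commands.
  make_ij_script
fi
```

The Hive Schema Tool page linked above describes an alternative that should avoid ij altogether: `schematool -dbType derby -initSchema`.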

person Flibidi    schedule 22.05.2017