
InstEmb

Introduction to Instruction-Aware Embedding

Instruction-Aware Embedding aims to generate text embeddings conditioned on specific instructions, enabling embeddings to capture distinct semantic aspects based on different instructions. Ideally, embeddings generated under diverse instructions should highlight different semantic emphases of the same text, providing richer, more targeted semantic representations.

For example, in product retrieval scenarios, embeddings conditioned on queries such as “Is this product suitable for outdoor use?” and “Does this product have warranty coverage?” should ideally reflect different semantic aspects of the same product description.
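To make the idea concrete, here is a minimal sketch of instruction-conditioned embedding with an off-the-shelf decoder-only LLM and last-token pooling. The model name, prompt template, and pooling choice are illustrative assumptions, not the InstEmb implementation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder backbone; substitute any decoder-only LLM (an assumption, not InstEmb's backbone).
model_name = "Qwen/Qwen2.5-0.5B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

def embed(text: str, instruction: str) -> torch.Tensor:
    # Condition the embedding on the instruction by prepending it to the text.
    prompt = f"Instruction: {instruction}\nText: {text}"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    # Common LLM-embedding choice: take the final token's hidden state.
    return hidden[0, -1]

doc = "Lightweight waterproof jacket with a two-year manufacturer warranty."
e_outdoor = embed(doc, "Is this product suitable for outdoor use?")
e_warranty = embed(doc, "Does this product have warranty coverage?")
# Ideally, the two vectors emphasize different aspects of the same description.
```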

Abstract

Recent advances have empowered large language models (LLMs) with remarkable fine-grained instruction-following capabilities in text generation tasks. However, existing LLM-based embedding methods typically rely solely on the final token's hidden state, limiting their ability to capture complementary semantic signals across multiple tokens. Moreover, existing discrete-to-continuous re-encoding approaches introduce semantic discontinuity. To address these limitations, we propose InstEmb, a novel instruction-aware embedding framework that jointly optimizes primary semantic information (via contrastive learning on the final token) and complementary semantic signals (through representation distillation with learnable soft prompts). Additionally, we introduce Dual-Anchor Alignment Pooling (DAAP), explicitly aligned with our dual training objectives. Extensive experiments demonstrate that InstEmb achieves state-of-the-art performance across multiple instruction-aware benchmarks, significantly outperforming strong baselines such as Inbedder, FollowIR, and Promptriever without benchmark-specific supervised data. Furthermore, InstEmb exhibits robust generalization to generic embedding tasks, despite using substantially fewer training examples. Ablation analyses reveal the effectiveness of our proposed distillation and contrastive strategies, highlighting the critical regularization role of soft prompts.

Motivation

Recently, large language models (LLMs) have significantly advanced in their ability to follow fine-grained instructions, achieving impressive zero-shot performance across diverse downstream tasks. However, fine-grained instruction adaptability remains challenging for LLM-based embedding methods, which suffer from several limitations:

- They typically derive the embedding solely from the final token's hidden state, limiting their ability to capture complementary semantic signals distributed across multiple tokens.
- Existing discrete-to-continuous re-encoding approaches introduce semantic discontinuity.

Thus, there is a clear need for an embedding method that can jointly optimize both primary and complementary semantic information, preserving semantic continuity and ensuring robust instruction adaptability.

Method

To explicitly optimize both primary semantic information and complementary semantic signals, we propose InstEmb, a novel instruction-aware embedding framework. InstEmb employs two main optimization strategies (a minimal sketch of the joint objective follows this list):

- Contrastive learning on the final token's hidden state, optimizing the primary semantic information of the instruction-conditioned input.
- Representation distillation with learnable soft prompts, capturing complementary semantic signals across multiple tokens.
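The sketch below shows one way the two objectives could be combined: an InfoNCE-style contrastive loss on final-token embeddings plus a cosine-based distillation loss on the soft-prompt representation. The loss forms, temperature, weighting, and teacher signal are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(q, p, temperature=0.05):
    """InfoNCE with in-batch negatives over final-token embeddings of shape (B, D)."""
    q = F.normalize(q, dim=-1)
    p = F.normalize(p, dim=-1)
    logits = q @ p.T / temperature                   # (B, B); diagonal entries are positives
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

def distillation_loss(soft_prompt_repr, teacher_repr):
    """Pull the soft-prompt representation toward a teacher carrying complementary semantics."""
    return 1.0 - F.cosine_similarity(soft_prompt_repr, teacher_repr, dim=-1).mean()

def joint_loss(q_last, p_last, soft_prompt_repr, teacher_repr, alpha=1.0):
    # Primary semantics (contrastive) + complementary semantics (distillation).
    return contrastive_loss(q_last, p_last) + alpha * distillation_loss(soft_prompt_repr, teacher_repr)
```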

Additionally, we propose Dual-Anchor Alignment Pooling (DAAP), explicitly aligned with our dual training objectives. DAAP integrates two semantically complementary anchors, one aligned with each objective: an anchor tied to the final-token contrastive objective and an anchor tied to the soft-prompt distillation objective.
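As a rough illustration of the dual-anchor idea (the actual DAAP operator is not specified here), the sketch below pools the token states twice, once weighted by similarity to each anchor, and averages the two pooled vectors. The weighting scheme and the averaging step are assumptions.

```python
import torch
import torch.nn.functional as F

def dual_anchor_pool(hidden: torch.Tensor, anchor_primary: torch.Tensor,
                     anchor_comp: torch.Tensor) -> torch.Tensor:
    """hidden: (seq_len, dim); anchors: (dim,). Returns a (dim,) embedding."""
    def pool(anchor):
        # Weight each token by its similarity to the anchor, then take the weighted mean.
        weights = F.softmax(hidden @ anchor, dim=0)          # (seq_len,)
        return (weights.unsqueeze(-1) * hidden).sum(dim=0)   # (dim,)
    pooled_primary = pool(anchor_primary)   # aligned with the contrastive (final-token) objective
    pooled_comp = pool(anchor_comp)         # aligned with the distillation (soft-prompt) objective
    return 0.5 * (pooled_primary + pooled_comp)
```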

The overall framework of InstEmb is illustrated below:

Link

Performance

InstEmb achieves state-of-the-art performance across multiple instruction-aware benchmarks and competitive results in generic embedding tasks, significantly outperforming strong baselines without benchmark-specific supervised training.

The main results are shown below:

Link

Application in Our Work

InstEmb can be effectively applied to various practical embedding scenarios, such as the instruction-conditioned product retrieval described in the introduction, where the same document must serve differently focused queries.

In our work, InstEmb provides an efficient and powerful embedding solution, enabling fine-grained instruction adaptability and significantly improving embedding quality across diverse semantic tasks.
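As a toy usage example, the snippet below reuses the illustrative embed() helper sketched in the introduction to rank candidate products for an instruction-conditioned query. The query text, product set, and ranking scheme are invented for illustration.

```python
import torch.nn.functional as F

def rank_products(embed_fn, instruction, query_text, products):
    """embed_fn(text, instruction) -> 1-D tensor; returns products sorted by similarity."""
    query_vec = embed_fn(query_text, instruction)
    scored = [(F.cosine_similarity(query_vec, embed_fn(p, instruction), dim=0).item(), p)
              for p in products]
    return [p for _, p in sorted(scored, reverse=True)]

products = [
    "Lightweight waterproof jacket with a two-year manufacturer warranty.",
    "Indoor ceramic planter, decorative use only.",
]
ranked = rank_products(embed, "Does this product have warranty coverage?",
                       "warranty coverage", products)
```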
