Advanced features
Use these methods to harden your Langfuse instrumentation, protect sensitive data, and adapt the SDKs to your specific environment.
Filtering by Instrumentation Scope
Langfuse now applies a default span filter in both SDKs to keep exports LLM-focused without extra configuration.
By default, a span is exported if any of these are true:
- It was created by the Langfuse SDK (`instrumentation_scope.name == "langfuse-sdk"`)
- It has at least one `gen_ai.*` attribute
- It comes from a known LLM instrumentation scope (for example `openinference.*`, `langsmith`, `haystack`, `litellm`, `agent_framework`, `strands-agents`, `vllm`, `opentelemetry.instrumentation.anthropic`)
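As an illustration, the decision above can be written as a small predicate. This is a hedged restatement of the documented rules, not the SDK's actual implementation (the SDK exposes the real check as `is_default_export_span`):

```python
# Illustrative sketch of the default export decision (not the SDK's code)
KNOWN_LLM_SCOPE_PREFIXES = (
    "openinference.", "langsmith", "haystack", "litellm",
    "agent_framework", "strands-agents", "vllm",
    "opentelemetry.instrumentation.anthropic",
)

def default_export_decision(scope_name: str, attributes: dict) -> bool:
    if scope_name == "langfuse-sdk":
        return True  # created by the Langfuse SDK
    if any(key.startswith("gen_ai.") for key in attributes):
        return True  # carries at least one gen_ai.* attribute
    # known LLM instrumentation scope
    return scope_name.startswith(KNOWN_LLM_SCOPE_PREFIXES)
```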
If you want another integration added to the default instrumentation scope allowlist, open an issue in langfuse/langfuse with the scope name and a sample span.
You can inspect a span’s instrumentation scope in Langfuse under metadata.scope.name. Filtered-out spans do not appear in the UI.
To identify filtered scopes:
- Enable debug logging (`Langfuse(debug=True)` or `LANGFUSE_DEBUG="True"` on Python, `LANGFUSE_DEBUG="true"` or `LANGFUSE_LOG_LEVEL="DEBUG"` on JS/TS).
- Run your application and check logs for dropped-span messages and instrumentation scope names.
- Add those scopes to your allowlist logic by composing with `is_default_export_span` / `isDefaultExportSpan`.
- Optional: temporarily use `should_export_span=lambda span: True` or `shouldExportSpan: () => true` to inspect all spans, then restore filtering.
Earlier SDK versions exported all non-blocked spans by default. To restore that behavior, provide an always-true custom filter callback.
Filtering spans may break the parent-child relationships in your traces. For example, if you filter out a parent span but keep its children, you may see “orphaned” observations in the Langfuse UI or traces without any input or output.
Default behavior (recommended):

```python
from langfuse import Langfuse

# Smart default filter (Langfuse + GenAI/LLM spans)
langfuse = Langfuse()
```

Export everything:

```python
from langfuse import Langfuse

langfuse = Langfuse(should_export_span=lambda span: True)
```

Passing `should_export_span` replaces the default filter. To keep the default behavior and extend it, compose with `is_default_export_span`.
Compose custom logic with built-in predicates:

```python
from langfuse import Langfuse
from langfuse.span_filter import is_default_export_span

langfuse = Langfuse(
    should_export_span=lambda span: (
        is_default_export_span(span)
        or (
            span.instrumentation_scope is not None
            and span.instrumentation_scope.name.startswith("my_framework")
        )
    )
)
```

Only export spans created by the Langfuse SDK:

```python
from langfuse import Langfuse
from langfuse.span_filter import is_langfuse_span

langfuse = Langfuse(should_export_span=is_langfuse_span)
```

Available Python helpers: `is_default_export_span`, `is_langfuse_span`, `is_genai_span`, `is_known_llm_instrumentor`, `KNOWN_LLM_INSTRUMENTATION_SCOPE_PREFIXES`.
blocked_instrumentation_scopes still works for backward compatibility, but is deprecated and planned for removal in a future version. Prefer expressing deny rules in should_export_span.
Deprecated compatibility example:

```python
from langfuse import Langfuse

langfuse = Langfuse(
    should_export_span=lambda span: True,
    blocked_instrumentation_scopes=["sqlalchemy", "psycopg"],
)
```

You can read more about using Langfuse with an existing OpenTelemetry setup here.
Mask sensitive data
If your trace data (inputs, outputs, metadata) might contain sensitive information (PII, secrets), you can provide a mask function during client initialization. This function will be applied to all relevant data before it’s sent to Langfuse.
The mask function should accept data as a keyword argument and return the masked data. The returned data must be JSON-serializable.
```python
import re
from typing import Any

from langfuse import Langfuse

def pii_masker(data: Any, **kwargs) -> Any:
    if isinstance(data, str):
        return re.sub(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", "[EMAIL_REDACTED]", data)
    elif isinstance(data, dict):
        return {k: pii_masker(data=v) for k, v in data.items()}
    elif isinstance(data, list):
        return [pii_masker(data=item) for item in data]
    return data

langfuse = Langfuse(mask=pii_masker)
```

Logging & debugging
The Langfuse SDK can expose detailed logging and debugging information to help you troubleshoot issues with your application.
In code:
The Langfuse SDK uses Python’s standard logging module. The main logger is named "langfuse".
To enable detailed debug logging, you can either:
- Set the `debug=True` parameter when initializing the `Langfuse` client.
- Configure the `"langfuse"` logger manually:
```python
import logging

langfuse_logger = logging.getLogger("langfuse")
langfuse_logger.setLevel(logging.DEBUG)
```

The default log level for the `langfuse` logger is `logging.WARNING`.
Via environment variable:
You can also enable debug mode by setting the LANGFUSE_DEBUG environment variable.
```shell
export LANGFUSE_DEBUG="True"
```

Sampling
Sampling lets you send only a subset of traces to Langfuse. This is useful for reducing costs and noise in high-volume applications.
In code:
You can configure the SDK to sample traces by setting the sample_rate parameter during client initialization. This value should be a float between 0.0 (sample 0% of traces) and 1.0 (sample 100% of traces).
If a trace is not sampled, none of its observations (spans, generations) or associated scores will be sent to Langfuse.
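Under the hood, trace-level sampling of this kind is typically a deterministic decision derived from the trace ID, so that all observations in a trace share the same fate. A minimal sketch of the idea (illustrative, not necessarily the SDK's exact algorithm):

```python
import hashlib

def is_sampled(trace_id: str, sample_rate: float) -> bool:
    # Hash the trace ID into a uniform bucket in [0, 1); the same
    # trace_id always yields the same decision
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < sample_rate

# Every observation of a given trace gets the same keep/drop decision
assert is_sampled("0af7651916cd43dd", 0.2) == is_sampled("0af7651916cd43dd", 0.2)
```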
```python
from langfuse import Langfuse

# Sample approximately 20% of traces
langfuse_sampled = Langfuse(sample_rate=0.2)
```

Via environment variable:
You can also set the sample rate using the LANGFUSE_SAMPLE_RATE environment variable.
```shell
export LANGFUSE_SAMPLE_RATE="0.2"
```

Isolated TracerProvider
You can configure a separate OpenTelemetry TracerProvider for use with Langfuse. This creates isolation between Langfuse tracing and your other observability systems.
Benefits of isolation:
- Langfuse spans won’t be sent to your other observability backends (e.g., Datadog, Jaeger, Zipkin)
- Third-party library spans won’t be sent to Langfuse
- Independent configuration and sampling rates
While TracerProviders are isolated, they share the same OpenTelemetry context for tracking active spans. This can cause span relationship issues where:
- A parent span from one TracerProvider might have children from another TracerProvider
- Some spans may appear “orphaned” if their parent spans belong to a different TracerProvider
- Trace hierarchies may be incomplete or confusing
Plan your instrumentation carefully to avoid confusing trace structures.
```python
from opentelemetry.sdk.trace import TracerProvider

from langfuse import Langfuse

# Do not set this as the global tracer provider, to keep isolation
langfuse_tracer_provider = TracerProvider()

langfuse = Langfuse(tracer_provider=langfuse_tracer_provider)

# This span is isolated from the remaining OTel instrumentation
langfuse.start_observation(name="myspan").end()
```

You can read more about using Langfuse with an existing OpenTelemetry setup here.
Multi-project setups
Multi-project setups are experimental in the Python SDK and have important limitations regarding third-party OpenTelemetry integrations.
The Langfuse Python SDK supports routing traces to different projects within the same application by using multiple public keys. This works because the Langfuse SDK adds a specific span attribute containing the public key to all spans it generates.
How it works:
- Span Attributes: The Langfuse SDK adds a specific span attribute containing the public key to spans it creates
- Multiple Processors: Multiple span processors are registered onto the global tracer provider, each with their respective exporters bound to a specific public key
- Filtering: Within each span processor, spans are filtered based on the presence and value of the public key attribute
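The filtering step can be sketched as follows. The attribute key `langfuse.public_key` here is hypothetical (the actual key used by the SDK is an internal detail); the logic mirrors the routing rules described above:

```python
# Hypothetical attribute key; the real key used by the SDK is internal
PUBLIC_KEY_ATTRIBUTE = "langfuse.public_key"

def should_process(span_attributes: dict, processor_public_key: str) -> bool:
    span_key = span_attributes.get(PUBLIC_KEY_ATTRIBUTE)
    if span_key is None:
        # Third-party spans lack the attribute, so every processor
        # forwards them: they can end up in all projects
        return True
    return span_key == processor_public_key

# A Langfuse-generated span is routed only to its own project
should_process({"langfuse.public_key": "pk-lf-project-a-..."}, "pk-lf-project-a-...")  # True
```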
Important Limitation with Third-Party Libraries:
Third-party libraries that emit OpenTelemetry spans automatically (e.g., HTTP clients, databases, other instrumentation libraries) do not have the Langfuse public key span attribute. As a result:
- Third-party spans that pass the export filter and have no public key cannot be routed to a specific project
- These spans are processed by all span processors and can be sent to all projects
- With the default filter, this mainly affects GenAI/LLM spans from third-party instrumentors (infrastructure spans are typically filtered out)
Why is this experimental?
This approach requires that the public_key parameter be passed to all Langfuse SDK executions across all integrations to ensure proper routing, and third-party spans that pass filtering may appear in all projects.
Initialization
To set up multiple projects, initialize separate Langfuse clients for each project:
```python
from langfuse import Langfuse

# Initialize clients for different projects
project_a_client = Langfuse(
    public_key="pk-lf-project-a-...",
    secret_key="sk-lf-project-a-...",
    base_url="https://cloud.langfuse.com"
)

project_b_client = Langfuse(
    public_key="pk-lf-project-b-...",
    secret_key="sk-lf-project-b-...",
    base_url="https://cloud.langfuse.com"
)
```

Integration Usage
For all integrations in multi-project setups, you must specify the public_key parameter to ensure traces are routed to the correct project.
Observe Decorator:
Pass `langfuse_public_key` as a keyword argument to the top-most observed function (not to the decorator). From Python SDK >= 3.2.2, nested decorated functions automatically pick up the public key from the execution context they run in. Calls to `get_client` are likewise aware of the current `langfuse_public_key` within the decorated function's execution context, so passing it again there is not necessary.
```python
from langfuse import observe

@observe
def nested():
    # A get_client call here is context-aware: if this runs inside another
    # decorated function that received langfuse_public_key, the key does
    # not need to be passed again
    pass

@observe
def process_data_for_project_a(data):
    # Passing langfuse_public_key here again is not necessary,
    # as it is stored in the execution context
    nested()
    return {"processed": data}

@observe
def process_data_for_project_b(data):
    # Passing langfuse_public_key here again is not necessary,
    # as it is stored in the execution context
    nested()
    return {"enhanced": data}

# Route to Project A
# The top-most decorated function needs the langfuse_public_key kwarg
result_a = process_data_for_project_a(
    data="input data",
    langfuse_public_key="pk-lf-project-a-..."
)

# Route to Project B
result_b = process_data_for_project_b(
    data="input data",
    langfuse_public_key="pk-lf-project-b-..."
)
```

OpenAI Integration:
Add langfuse_public_key as a keyword argument to the OpenAI execution:
```python
from langfuse.openai import openai

client = openai.OpenAI()

# Route to Project A
response_a = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Project A"}],
    langfuse_public_key="pk-lf-project-a-..."
)

# Route to Project B
response_b = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello from Project B"}],
    langfuse_public_key="pk-lf-project-b-..."
)
```

Langchain Integration:
Add public_key to the CallbackHandler constructor:
```python
from langfuse.langchain import CallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Create handlers for different projects
handler_a = CallbackHandler(public_key="pk-lf-project-a-...")
handler_b = CallbackHandler(public_key="pk-lf-project-b-...")

llm = ChatOpenAI(model_name="gpt-4o")
prompt = ChatPromptTemplate.from_template("Tell me about {topic}")
chain = prompt | llm

# Route to Project A
response_a = chain.invoke(
    {"topic": "machine learning"},
    config={"callbacks": [handler_a]}
)

# Route to Project B
response_b = chain.invoke(
    {"topic": "data science"},
    config={"callbacks": [handler_b]}
)
```

Important Considerations:
- Every Langfuse SDK execution across all integrations must include the appropriate public key parameter
- Missing public key parameters may result in traces being routed to the default project or lost
- Third-party OpenTelemetry spans that pass filtering may appear in all projects since they lack the Langfuse public key attribute
Time to first token (TTFT)
You can use the `completion_start_time` attribute to manually set the time to first token (TTFT) of your LLM calls. This is useful for measuring the latency of your LLM calls and for identifying slow ones.
```python
import datetime
import time

from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="generation", name="TTFT-Generation") as generation:
    time.sleep(3)
    generation.update(
        completion_start_time=datetime.datetime.now(),
        output="some response",
    )

langfuse.flush()
```

Self-signed SSL certificates (self-hosted Langfuse)
If you are self-hosting Langfuse and you’d like to use self-signed SSL certificates, you will need to configure the SDK to trust the self-signed certificate:
Changing SSL settings has major security implications depending on your environment. Be sure you understand these implications before you proceed.
1. Set the OpenTelemetry span exporter to trust the self-signed certificate:

```shell
OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE="/path/to/my-selfsigned-cert.crt"
```

2. Set HTTPX to trust the certificate for all other API requests to your Langfuse instance:
```python
import os

import httpx

from langfuse import Langfuse

httpx_client = httpx.Client(verify=os.environ["OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE"])

langfuse = Langfuse(httpx_client=httpx_client)
```

Setup with Sentry
If you’re using both Sentry and Langfuse in your application, you’ll need to configure a custom OpenTelemetry setup since both tools use OpenTelemetry for tracing. This guide shows how to send error monitoring data to Sentry while simultaneously capturing LLM observability traces in Langfuse.
Thread pools and multiprocessing
Use the OpenTelemetry threading instrumentor so context flows across worker threads.
```python
from opentelemetry.instrumentation.threading import ThreadingInstrumentor

ThreadingInstrumentor().instrument()
```

For multiprocessing, follow the OpenTelemetry guidance. If you use Pydantic Logfire, enable `distributed_tracing=True`.
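The reason the instrumentor is needed: OpenTelemetry stores the active span in a `contextvars` context, and a newly started thread does not inherit the caller's context. A pure-Python sketch of the underlying issue (no OpenTelemetry required):

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

current_span_name = contextvars.ContextVar("current_span_name", default=None)

def read_context():
    return current_span_name.get()

current_span_name.set("parent-span")

with ThreadPoolExecutor(max_workers=1) as pool:
    # Without propagation, the worker thread sees the default value
    lost = pool.submit(read_context).result()

    # Copying the caller's context (which is what ThreadingInstrumentor does
    # for the OpenTelemetry context) restores the value in the worker thread
    ctx = contextvars.copy_context()
    kept = pool.submit(ctx.run, read_context).result()

print(lost, kept)  # → None parent-span
```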