Integrating Tongyi Qianwen 1.5-1.8B-Chat-GPTQ-Int4 with Spring Boot Microservices: A Hands-On Guide
Published: 2026/5/27 3:22:18
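1. Introduction

I have recently been working on an intelligent customer service project that needs a large language model to handle user inquiries. The Tongyi Qianwen 1.5-1.8B-Chat-GPTQ-Int4 build suits a scenario like ours, where response time matters: the quantized model is small, infers quickly, and still holds up reasonably well in conversation.

Spring Boot is one of the most widely used microservice frameworks in the Java world, and pairing it with AI models is increasingly common. In practice, though, I have seen many developers run into trouble with API calls, performance tuning, and exception handling. This article shares the hands-on experience from my project to help you quickly build a stable, efficient question-answering service.

2. Environment setup and project scaffolding

2.1 Basic environment requirements

Before you start, make sure your development environment meets the following requirements:

- JDK 17 or later for Spring Boot 3.x (JDK 11 is enough only if you stay on Spring Boot 2.7)
- Maven 3.6+ or Gradle 7.x
- Spring Boot 2.7+ or 3.x (the examples below assume 3.2, matching the Initializr command in 2.2)
- API access to the Tongyi Qianwen model

2.2 Creating the Spring Boot project

Use Spring Initializr to generate the project skeleton. Besides `web` and `validation`, the command pulls in `lombok` and `actuator`, because later sections rely on Lombok annotations and on Micrometer's `MeterRegistry`:

```bash
curl https://start.spring.io/starter.zip \
  -d dependencies=web,validation,lombok,actuator \
  -d type=maven-project \
  -d language=java \
  -d bootVersion=3.2.0 \
  -d baseDir=ai-service \
  -d packageName=com.example.ai \
  -d name=ai-service \
  -o ai-service.zip
```

Unzipping the archive gives you a standard Spring Boot project layout, and the AI integration code is added on top of it. Two libraries used in section 5 are not available through Initializr and have to be added to the build file by hand: Caffeine (`com.github.ben-manes.caffeine:caffeine`) and Apache HttpClient 5 (`org.apache.httpcomponents.client5:httpclient5`).

3. Core service layer design

3.1 API client wrapper

Start with a client wrapper for the Tongyi Qianwen API. This is the core component that talks to the model. The client calls the hosted DashScope text-generation endpoint, and the `model` value is the identifier used in this project, so adjust it (and `API_URL`) to whatever your account or your own inference server exposes.

```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class TongyiQianwenClient {

    private static final String API_URL =
            "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation";

    @Value("${tongyi.api-key}")
    private String apiKey;

    private final RestTemplate restTemplate;

    public TongyiQianwenClient(RestTemplateBuilder restTemplateBuilder) {
        this.restTemplate = restTemplateBuilder.build();
    }

    public String generateResponse(String prompt) {
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        headers.set("Authorization", "Bearer " + apiKey);

        // Request body: model name, input prompt, and sampling parameters
        Map<String, Object> requestBody = new HashMap<>();
        requestBody.put("model", "qwen-1.8b-chat");

        Map<String, Object> input = new HashMap<>();
        input.put("prompt", prompt);
        requestBody.put("input", input);

        Map<String, Object> parameters = new HashMap<>();
        parameters.put("temperature", 0.7);
        parameters.put("top_p", 0.9);
        requestBody.put("parameters", parameters);

        HttpEntity<Map<String, Object>> entity = new HttpEntity<>(requestBody, headers);

        try {
            ResponseEntity<Map> response =
                    restTemplate.postForEntity(API_URL, entity, Map.class);
            return extractResponseText(response.getBody());
        } catch (Exception e) {
            throw new RuntimeException("Failed to call the Tongyi Qianwen API", e);
        }
    }

    @SuppressWarnings("unchecked")
    private String extractResponseText(Map<String, Object> response) {
        // Parse the API response and pull out the generated text (output.text)
        if (response != null && response.containsKey("output")) {
            Map<String, Object> output = (Map<String, Object>) response.get("output");
            if (output != null && output.containsKey("text")) {
                return (String) output.get("text");
            }
        }
        throw new RuntimeException("Failed to parse the API response");
    }
}
```

3.2 Service layer implementation

Next, create the business service layer that handles the actual question-answering logic. Note that `processQuery` never throws: on any failure it returns a fixed fallback message. The message is exposed as a constant so that the caching and monitoring layers in later sections can recognise a degraded answer.

```java
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class AIService {

    public static final String FALLBACK_MESSAGE =
            "Sorry, we are unable to process your request right now. Please try again later.";

    private final TongyiQianwenClient tongyiClient;

    public AIService(TongyiQianwenClient tongyiClient) {
        this.tongyiClient = tongyiClient;
    }

    public String processQuery(String query) {
        try {
            // Build a prompt suited to the model
            String prompt = buildPrompt(query);
            log.info("Sending prompt: {}", prompt);

            // Call the model to generate a response
            String response = tongyiClient.generateResponse(prompt);
            log.info("Received response: {}", response);

            // Post-process the response text
            return postProcessResponse(response);
        } catch (Exception e) {
            log.error("Error while processing the query", e);
            return FALLBACK_MESSAGE;
        }
    }

    private String buildPrompt(String query) {
        // Adapt the prompt to your actual scenario
        return String.format(
                "You are a helpful AI assistant. Please answer the following question in Chinese: %s",
                query);
    }

    private String postProcessResponse(String response) {
        // Turn literal "\n" escape sequences into real line breaks and trim whitespace.
        // String.replace is used here because replaceAll("\\n", "\n") treats the first
        // argument as a regex that matches an actual newline, which makes it a no-op.
        return response.replace("\\n", "\n").trim();
    }
}
```

4. Interface layer design and implementation

4.1 RESTful API design

Create a controller layer that exposes a small, clean API:

```java
import jakarta.validation.Valid;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.validation.annotation.Validated;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/ai")
@Validated
public class AIController {

    private final AIService aiService;

    public AIController(AIService aiService) {
        this.aiService = aiService;
    }

    @PostMapping("/chat")
    public ResponseEntity<ApiResponse<String>> chat(
            @RequestBody @Valid ChatRequest request) {
        try {
            String response = aiService.processQuery(request.getMessage());
            return ResponseEntity.ok(ApiResponse.success(response));
        } catch (Exception e) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                    .body(ApiResponse.error("An error occurred while processing the request"));
        }
    }

    @GetMapping("/health")
    public ResponseEntity<ApiResponse<String>> healthCheck() {
        return ResponseEntity.ok(ApiResponse.success("Service is running"));
    }
}
```

4.2 Request and response objects

Define clear DTOs (each class goes in its own file):

```java
// ChatRequest.java
import jakarta.validation.constraints.NotBlank;
import jakarta.validation.constraints.Size;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class ChatRequest {

    @NotBlank(message = "Message content must not be empty")
    @Size(max = 1000, message = "Message must not exceed 1000 characters")
    private String message;

    private String conversationId;
}

// ApiResponse.java
@Data
@NoArgsConstructor
@AllArgsConstructor
public class ApiResponse<T> {

    private boolean success;
    private String message;
    private T data;
    private long timestamp;

    public static <T> ApiResponse<T> success(T data) {
        return new ApiResponse<>(true, "Success", data, System.currentTimeMillis());
    }

    public static <T> ApiResponse<T> error(String message) {
        return new ApiResponse<>(false, message, null, System.currentTimeMillis());
    }
}
```

5. Performance optimization in practice

5.1 Connection pool configuration

Tune the HTTP connection pool to make API calls more efficient:

```yaml
# application.yml
tongyi:
  api-key: your-api-key-here
  connection:
    timeout: 5000
    read-timeout: 30000
    max-connections: 100
    max-per-route: 50
```

The matching configuration class follows. It uses the HttpClient 5 API, which is what `HttpComponentsClientHttpRequestFactory` expects on Spring Boot 3.x; on Spring Boot 2.7 with HttpClient 4.x the timeouts are set through `RequestConfig.setConnectTimeout` and `setSocketTimeout` instead.

```java
import org.apache.hc.client5.http.config.ConnectionConfig;
import org.apache.hc.client5.http.config.RequestConfig;
import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager;
import org.apache.hc.core5.util.Timeout;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.ClientHttpRequestFactory;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RestTemplateConfig {

    @Value("${tongyi.connection.timeout:5000}")
    private int connectionTimeout;

    @Value("${tongyi.connection.read-timeout:30000}")
    private int readTimeout;

    @Value("${tongyi.connection.max-connections:100}")
    private int maxConnections;

    @Value("${tongyi.connection.max-per-route:50}")
    private int maxPerRoute;

    @Bean
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        // Timeouts are configured on the HttpClient itself, see requestFactory()
        return builder.requestFactory(this::requestFactory).build();
    }

    private ClientHttpRequestFactory requestFactory() {
        PoolingHttpClientConnectionManager connectionManager =
                new PoolingHttpClientConnectionManager();
        connectionManager.setMaxTotal(maxConnections);
        connectionManager.setDefaultMaxPerRoute(maxPerRoute);
        connectionManager.setDefaultConnectionConfig(ConnectionConfig.custom()
                .setConnectTimeout(Timeout.ofMilliseconds(connectionTimeout))
                .build());

        RequestConfig requestConfig = RequestConfig.custom()
                .setResponseTimeout(Timeout.ofMilliseconds(readTimeout))
                .build();

        CloseableHttpClient httpClient = HttpClients.custom()
                .setConnectionManager(connectionManager)
                .setDefaultRequestConfig(requestConfig)
                .build();

        return new HttpComponentsClientHttpRequestFactory(httpClient);
    }
}
```

One thing to watch: `TongyiQianwenClient` in 3.1 builds its own `RestTemplate` from `RestTemplateBuilder`, so it does not pick up this bean. For the pool settings to take effect, change its constructor to accept the bean directly: `public TongyiQianwenClient(RestTemplate restTemplate) { this.restTemplate = restTemplate; }`.

5.2 Caching strategy

Implement a simple response cache to cut down on repeated requests:

```java
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.stereotype.Component;

@Component
public class ResponseCache {

    private final Cache<String, String> cache;

    public ResponseCache() {
        this.cache = Caffeine.newBuilder()
                .expireAfterWrite(30, TimeUnit.MINUTES)
                .maximumSize(1000)
                .build();
    }

    public String getCachedResponse(String key) {
        return cache.getIfPresent(key);
    }

    public void cacheResponse(String key, String response) {
        cache.put(key, response);
    }

    public void invalidateKey(String key) {
        cache.invalidate(key);
    }
}
```

Then integrate the cache in the service layer. The fallback message is deliberately not cached; otherwise a single failed call would keep serving the error text for 30 minutes.

```java
import java.nio.charset.StandardCharsets;

import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.springframework.util.DigestUtils;

@Service
@Slf4j
public class CachedAIService {

    private final AIService aiService;
    private final ResponseCache responseCache;

    public CachedAIService(AIService aiService, ResponseCache responseCache) {
        this.aiService = aiService;
        this.responseCache = responseCache;
    }

    public String processQueryWithCache(String query) {
        // Generate the cache key
        String cacheKey = generateCacheKey(query);

        // Check the cache
        String cachedResponse = responseCache.getCachedResponse(cacheKey);
        if (cachedResponse != null) {
            log.info("Cache hit: {}", cacheKey);
            return cachedResponse;
        }

        // Call the underlying service
        String response = aiService.processQuery(query);

        // Cache only real answers, never the degraded fallback
        if (!AIService.FALLBACK_MESSAGE.equals(response)) {
            responseCache.cacheResponse(cacheKey, response);
        }
        return response;
    }

    private String generateCacheKey(String query) {
        // Use an explicit charset so the key does not depend on the platform default
        return DigestUtils.md5DigestAsHex(query.getBytes(StandardCharsets.UTF_8));
    }
}
```

6. Exception handling and monitoring

6.1 Global exception handling

Implement a unified exception handling mechanism:

```java
import java.util.stream.Collectors;

import lombok.extern.slf4j.Slf4j;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.validation.FieldError;
import org.springframework.web.bind.MethodArgumentNotValidException;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.client.HttpClientErrorException;

@RestControllerAdvice
@Slf4j
public class GlobalExceptionHandler {

    @ExceptionHandler(Exception.class)
    public ResponseEntity<ApiResponse<String>> handleAllExceptions(Exception ex) {
        log.error("Caught by global exception handler: ", ex);
        return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
                .body(ApiResponse.error("The system is busy, please try again later"));
    }

    @ExceptionHandler(MethodArgumentNotValidException.class)
    public ResponseEntity<ApiResponse<String>> handleValidationExceptions(
            MethodArgumentNotValidException ex) {
        String errorMessage = ex.getBindingResult().getFieldErrors().stream()
                .map(FieldError::getDefaultMessage)
                .collect(Collectors.joining(", "));
        return ResponseEntity.badRequest()
                .body(ApiResponse.error(errorMessage));
    }

    @ExceptionHandler(HttpClientErrorException.class)
    public ResponseEntity<ApiResponse<String>> handleHttpClientError(
            HttpClientErrorException ex) {
        log.warn("HTTP client error: {}", ex.getMessage());
        return ResponseEntity.status(ex.getStatusCode())
                .body(ApiResponse.error("API call failed: " + ex.getStatusText()));
    }
}
```

6.2 Monitoring and logging

Set up detailed logging and metrics:

```java
import java.util.concurrent.TimeUnit;

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Component;

@Component
public class AIMetrics {

    private final MeterRegistry meterRegistry;

    public AIMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    public void recordApiCall(boolean success, long duration) {
        meterRegistry.counter("ai.api.calls", "success", String.valueOf(success))
                .increment();
        meterRegistry.timer("ai.api.latency")
                .record(duration, TimeUnit.MILLISECONDS);
    }

    public void recordCacheHit(boolean hit) {
        meterRegistry.counter("ai.cache.hits", "hit", String.valueOf(hit))
                .increment();
    }
}
```

Then integrate the metrics in the service layer. Because `AIService.processQuery` swallows exceptions, success is judged by whether the fallback message came back rather than by whether an exception was thrown.

```java
import org.springframework.stereotype.Service;

@Service
public class MonitoredAIService {

    private final AIService aiService;
    private final AIMetrics aiMetrics;

    public MonitoredAIService(AIService aiService, AIMetrics aiMetrics) {
        this.aiService = aiService;
        this.aiMetrics = aiMetrics;
    }

    public String processQueryWithMonitoring(String query) {
        long startTime = System.currentTimeMillis();
        boolean success = false;
        try {
            String response = aiService.processQuery(query);
            // A fallback reply means the model call failed
            success = !AIService.FALLBACK_MESSAGE.equals(response);
            return response;
        } finally {
            long duration = System.currentTimeMillis() - startTime;
            aiMetrics.recordApiCall(success, duration);
        }
    }
}
```

7. Testing and verification

7.1 Unit tests

Write unit tests for the service layer:

```java
import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.ArgumentMatchers.contains;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

@ExtendWith(MockitoExtension.class)
class AIServiceTest {

    @Mock
    private TongyiQianwenClient tongyiClient;

    @InjectMocks
    private AIService aiService;

    @Test
    void shouldProcessQuerySuccessfully() {
        // Prepare test data
        String testQuery = "Hello, please introduce yourself";
        String expectedResponse = "I am the Tongyi Qianwen AI assistant...";

        // Stub the dependency
        when(tongyiClient.generateResponse(anyString()))
                .thenReturn(expectedResponse);

        // Run the test
        String actualResponse = aiService.processQuery(testQuery);

        // Verify the result
        assertThat(actualResponse).isEqualTo(expectedResponse);
        verify(tongyiClient).generateResponse(contains(testQuery));
    }

    @Test
    void shouldHandleApiFailureGracefully() {
        // Simulate an API failure
        when(tongyiClient.generateResponse(anyString()))
                .thenThrow(new RuntimeException("API call failed"));

        // Run the test and verify the fallback behaviour
        String response = aiService.processQuery("test");
        assertThat(response).isEqualTo(AIService.FALLBACK_MESSAGE);
    }
}
```

7.2 Integration tests

Write integration tests for the API layer:

```java
import static org.mockito.Mockito.when;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.post;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.jsonPath;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.http.MediaType;
import org.springframework.test.web.servlet.MockMvc;

@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@AutoConfigureMockMvc
class AIControllerIntegrationTest {

    @Autowired
    private MockMvc mockMvc;

    @MockBean
    private AIService aiService;

    @Test
    void shouldReturnSuccessResponse() throws Exception {
        // Prepare test data
        String expectedResponse = "Hello, I am an AI assistant";

        // Stub the service layer
        when(aiService.processQuery("Hello"))
                .thenReturn(expectedResponse);

        // Call the API
        mockMvc.perform(post("/api/ai/chat")
                        .contentType(MediaType.APPLICATION_JSON)
                        .content("{\"message\":\"Hello\",\"conversationId\":\"conv123\"}"))
                // Verify the response
                .andExpect(status().isOk())
                .andExpect(jsonPath("$.success").value(true))
                .andExpect(jsonPath("$.data").value(expectedResponse));
    }
}
```

8. Summary

In this project we integrated the Tongyi Qianwen 1.5-1.8B-Chat-GPTQ-Int4 model into a Spring Boot microservice. The integration itself is fairly straightforward; what matters is getting the API calls, exception handling, and performance tuning right.

In real use this setup has been stable for us, and its response time meets our business needs. The quantized model keeps decent conversational quality while significantly reducing resource consumption, which makes it a good fit for small and medium-scale question-answering scenarios.

If you are considering a similar integration, start with plain question answering and add caching, monitoring, and other advanced features step by step. Remember to handle exceptions properly and to have a degradation plan so the service stays stable. Later on you can add conversation history management and multi-turn dialogue support to round out the experience.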
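The summary recommends having a degradation plan in place. As a minimal sketch of that idea, and not the project's actual implementation, the helper below retries a model call a fixed number of times and then falls back to a canned reply. The class name `FallbackInvoker`, its constructor parameters, and the retry policy are all assumptions made for illustration; it is plain Java with no Spring dependency.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries a model call a few times, then degrades to a fixed reply. */
final class FallbackInvoker {

    private final int maxAttempts;
    private final String fallbackMessage;

    FallbackInvoker(int maxAttempts, String fallbackMessage) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.fallbackMessage = fallbackMessage;
    }

    /** Runs the call; returns its result, or the fallback once every attempt has failed. */
    String call(Supplier<String> modelCall) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String result = modelCall.get();
                // Treat null or blank output as a failed attempt as well
                if (result != null && !result.isBlank()) {
                    return result;
                }
            } catch (RuntimeException e) {
                // Swallow and retry; a real service would log and back off here
            }
        }
        return fallbackMessage;
    }
}
```

In the service from section 3.2, something like `invoker.call(() -> tongyiClient.generateResponse(prompt))` could stand in for the direct client call. Whether retrying is appropriate depends on the failure: a timeout may be worth another attempt, while an authentication error is not.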
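The summary also lists conversation history management and multi-turn dialogue as next steps, and `ChatRequest` already carries an unused `conversationId`. One possible shape for that feature is sketched below: an in-memory store that keeps the last few turns per conversation and folds them into the prompt. The class `ConversationHistory`, the `User:`/`Assistant:` prompt layout, and the turn limit are illustrative assumptions, not something the original project defines.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store keeping the last few turns per conversationId. */
final class ConversationHistory {

    private final int maxTurns;
    private final Map<String, Deque<String>> turnsByConversation = new ConcurrentHashMap<>();

    ConversationHistory(int maxTurns) {
        this.maxTurns = maxTurns;
    }

    /** Records one question/answer pair, evicting the oldest turn when the limit is exceeded. */
    void addTurn(String conversationId, String question, String answer) {
        Deque<String> turns =
                turnsByConversation.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
        synchronized (turns) {
            turns.addLast("User: " + question + "\nAssistant: " + answer);
            while (turns.size() > maxTurns) {
                turns.removeFirst();
            }
        }
    }

    /** Builds a prompt containing the stored turns followed by the new question. */
    String buildPrompt(String conversationId, String question) {
        StringBuilder prompt = new StringBuilder("You are a helpful AI assistant.\n");
        Deque<String> turns = turnsByConversation.get(conversationId);
        if (turns != null) {
            synchronized (turns) {
                for (String turn : turns) {
                    prompt.append(turn).append('\n');
                }
            }
        }
        return prompt.append("User: ").append(question).append("\nAssistant:").toString();
    }
}
```

Capping the number of turns keeps the prompt within a small model's context window. A production version would also need expiry for idle conversations and, with more than one service instance, a shared store instead of local memory.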